Hi,
I’m having a similar issue, were you able to solve it?
Thanks.
Hey all,
I've got a strange problem going on here. I've installed glusterfs-server
on ubuntu 16.04:
glusterfs-client/xenial,now 3.7.6-1ubuntu1 amd64 [installed,automatic]
glusterfs-common/xenial,now 3.7.6-1ubuntu1 amd64 [installed,automatic]
glusterfs-server/xenial,now 3.7.6-1ubuntu1 amd64 [installed]
I can successfully probe another peer at this point. Then, after installing
kubernetes via kargo, peer probing begins failing with a timeout. I've
tried stopping all kubernetes related services, and flushing all iptables
rules, however I don't see any packets leaving any interface when
attempting to peer probe.
from cli.log:
[2017-04-03 22:20:24.704900] I [MSGID: 101190]
[event-epoll.c:632:event_
dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-04-03 22:20:24.704973] T [cli.c:273:cli_rpc_notify] 0-glusterfs: got
RPC_CLNT_CONNECT
[2017-04-03 22:20:24.705001] T [cli-quotad-client.c:94:cli_
quotad_notify] 0-glusterfs: got RPC_CLNT_CONNECT
[2017-04-03 22:20:24.705014] I [socket.c:2355:socket_event_
handler] 0-transport: disconnecting now
[2017-04-03 22:20:24.705204] T [rpc-clnt.c:1404:rpc_clnt_
record] 0-glusterfs: Auth Info: pid: 0, uid: 0, gid: 0, owner:
[2017-04-03 22:20:24.705256] T
[rpc-clnt.c:1261:rpc_clnt_
record_build_header] 0-rpc-clnt: Request fraglen 156, payload: 92, rpc hdr: 64
[2017-04-03 22:20:24.705662] T [socket.c:2879:socket_connect] (-->
/usr/lib/x86_64-linux-gnu/
libglusterfs.so.0(_gf_log_ callingfn+0x1a3)[ 0x7f012fd21953] (--> /usr/lib/x86_64-linux-gnu
/libgfrpc.so.0(rpc_clnt_
remove_ping_timer_locked+0x84) [0x7f012f69add4] (--> /usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_clnt_ connection_cleanup+0x55)[ 0x7f012f697af5] (--> /usr/lib/x8
6_64-linux-gnu/libgfrpc.so.0(
rpc_clnt_notify+0x88)[ 0x7f012f698338] (--> /usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_transport_ notify+0x23)[0x7f012f6945b3] ))))) 0-glusterfs: connect
() called on transport already connected
[2017-04-03 22:20:24.705680] D
[rpc-clnt-ping.c:98:rpc_clnt_
remove_ping_timer_locked] (--> /usr/lib/x86_64-linux-gnu/
libglusterfs.so.0(_gf_log_ callingfn+0x1a3)[ 0x7f012fd21953] (--> /
usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_clnt_remove_ ping_timer_locked+0x84)[ 0x7f012f69add4] (-->
/usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_clnt_ connection_cleanup+0x55)[ 0x7f012f 697af5] (-->
/usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_clnt_notify+ 0x88)[0x7f012f698338] (-->
/usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_transport_ notify+0x23)[0x7f012f6945b3] )))
)) 0-: /var/run/gluster/quotad.
socket: ping timer event already removed [2017-04-03 22:20:24.705710] T [cli-quotad-client.c:100:cli_
quotad_notify] 0-glusterfs: got RPC_CLNT_DISCONNECT
[2017-04-03 22:20:24.705718] T [rpc-clnt.c:1598:rpc_clnt_
submit] 0-rpc-clnt: submitted request (XID: 0x1 Program: Gluster CLI, ProgVers: 2,
Proc: 1) to rpc-transport (glusterfs)
[2017-04-03 22:20:24.705739] D [rpc-clnt-ping.c:281:rpc_clnt_
start_ping] 0-glusterfs: ping timeout is 0, returning
[2017-04-03 22:20:24.705723] D [MSGID: 0]
[event-epoll.c:591:event_
dispatch_epoll_handler] 0-epoll: generation bumped on idx=1 from gen=1 to slot->gen=2, fd=7, slot->fd=7
[2017-04-03 22:20:27.614881] T [rpc-clnt.c:418:rpc_clnt_
reconnect] 0-glusterfs: attempting reconnect
[2017-04-03 22:20:27.615151] T [socket.c:2879:socket_connect] (-->
/usr/lib/x86_64-linux-gnu/
libglusterfs.so.0(_gf_log_ callingfn+0x1a3)[ 0x7f012fd21953] (-->
/usr/lib/x86_64-linux-gnu/
glusterfs/3.7.6/rpc-transport/ socket.so(+0x6c1b)[ 0x7f012a697c1b] (-->
/usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_clnt_ reconnect+0xb9)[ 0x7f012f695999] (-->
/usr/lib/x86_64-linux-gnu/
libglusterfs.so.0(gf_timer_ proc+0xfc)[0x7f012fd3d70c] (--> /lib/x86_64-linux-gnu/
libpthread.so.0(+0x76ba)[ 0x7f012f0b86ba] ))))) 0-glusterfs: connect () called on transport already connected
it then repeats the following:
[2017-04-03 22:20:27.615177] T [rpc-clnt.c:418:rpc_clnt_
reconnect] 0-glusterfs: attempting reconnect
[2017-04-03 22:20:27.615188] T [socket.c:2887:socket_connect] 0-glusterfs:
connecting 0x25d3550, state=0 gen=0 sock=-1
[2017-04-03 22:20:27.615200] T
[name.c:295:af_unix_client_
get_remote_sockaddr] 0-glusterfs: using connect-path /var/run/gluster/quotad.socket
[2017-04-03 22:20:27.615218] T [name.c:111:af_unix_client_
bind] 0-glusterfs: bind-path not specified for unix socket, letting connect to
assign default value
[2017-04-03 22:20:27.615329] T [cli-quotad-client.c:94:cli_
quotad_notify] 0-glusterfs: got RPC_CLNT_CONNECT
[2017-04-03 22:20:27.615355] I [socket.c:2355:socket_event_
handler] 0-transport: disconnecting now
[2017-04-03 22:20:27.615567] D
[rpc-clnt-ping.c:98:rpc_clnt_
remove_ping_timer_locked] (--> /usr/lib/x86_64-linux-gnu/
libglusterfs.so.0(_gf_log_ callingfn+0x1a3)[ 0x7f012fd21953] (-->
/usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_clnt_remove_ ping_timer_locked+0x84)[ 0x7f012f69add4] (-->
/usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_clnt_ connection_cleanup+0x55)[ 0x7f012f697af5] (-->
/usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_clnt_notify+ 0x88)[0x7f012f698338] (-->
/usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_transport_ notify+0x23)[0x7f012f6945b3] ))))) 0-: /var/run/gluster/quotad.
socket: ping timer event already removed [2017-04-03 22:20:27.615598] T [cli-quotad-client.c:100:cli_
quotad_notify] 0-glusterfs: got RPC_CLNT_DISCONNECT
[2017-04-03 22:20:27.615612] D [MSGID: 0]
[event-epoll.c:591:event_
dispatch_epoll_handler] 0-epoll: generation bumped on idx=1 from gen=4 to slot->gen=5, fd=7, slot->fd=7
[2017-04-03 22:20:30.615864] T [rpc-clnt.c:418:rpc_clnt_
reconnect] 0-glusterfs: attempting reconnect
[2017-04-03 22:20:30.616113] T [socket.c:2879:socket_connect] (-->
/usr/lib/x86_64-linux-gnu/
libglusterfs.so.0(_gf_log_ callingfn+0x1a3)[ 0x7f012fd21953] (-->
/usr/lib/x86_64-linux-gnu/
glusterfs/3.7.6/rpc-transport/ socket.so(+0x6c1b)[ 0x7f012a697c1b] (-->
/usr/lib/x86_64-linux-gnu/
libgfrpc.so.0(rpc_clnt_ reconnect+0xb9)[ 0x7f012f695999] (-->
/usr/lib/x86_64-linux-gnu/
libglusterfs.so.0(gf_timer_ proc+0xfc)[0x7f012fd3d70c] (--> /lib/x86_64-linux-gnu/
libpthread.so.0(+0x76ba)[ 0x7f012f0b86ba] ))))) 0-glusterfs: connect () called on transport already connected
and from etc-glusterfs-glusterd.vol.log i see the probe request and then:
[2017-04-03 22:20:24.707258] D [MSGID: 0]
[glusterd-peer-utils.c:133:
glusterd_peerinfo_find_by_ hostname] 0-management: Unable to find friend: 10.1.0.7
[2017-04-03 22:20:24.707260] D [MSGID: 0]
[glusterd-peer-utils.c:247:
glusterd_peerinfo_find] 0-management: Unable to find hostname: 10.1.0.7
[2017-04-03 22:20:24.707293] I [MSGID: 106129]
[glusterd-handler.c:3600:
glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 10.1.0.7 (24007)
[2017-04-03 22:20:24.707314] D [MSGID: 0]
[glusterd-peer-utils.c:458:
glusterd_peer_hostname_new] 0-glusterd: Returning 0
[2017-04-03 22:20:24.707329] D [MSGID: 0]
[glusterd-utils.c:5974:
glusterd_sm_tr_log_init] 0-glusterd: returning 0 [2017-04-03 22:20:24.707353] D [MSGID: 0]
[glusterd-store.c:3802:
glusterd_store_create_peer_ dir] 0-glusterd: Returning with 0
[2017-04-03 22:20:24.728550] D [MSGID: 0] [store.c:421:gf_store_handle_
new] 0-: Returning 0
[2017-04-03 22:20:24.728679] D [MSGID: 0] [store.c:375:gf_store_save_
value] 0-management: returning: 0
[2017-04-03 22:20:24.728720] D [logging.c:1952:_gf_msg_
internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About
to flush least recently used log message to disk
The message "D [MSGID: 0] [store.c:375:gf_store_save_
value] 0-management: returning: 0" repeated 2 times between [2017-04-03 22:20:24.728679] and
[2017-04-03 22:20:24.728718]
[2017-04-03 22:20:24.728720] D [MSGID: 0]
[glusterd-store.c:3931:
glusterd_store_peer_write] 0-glusterd: Returning with 0
[2017-04-03 22:20:24.786973] D [MSGID: 0]
[glusterd-store.c:3957:
glusterd_store_perform_peer_ store] 0-glusterd: Returning 0
[2017-04-03 22:20:24.787007] D [MSGID: 0]
[glusterd-store.c:3978:
glusterd_store_peerinfo] 0-glusterd: Returning with 0 [2017-04-03 22:20:24.787046] D [MSGID: 0]
[glusterd-handler.c:3391:
glusterd_transport_inet_ options_build] 0-glusterd: Returning 0
[2017-04-03 22:20:24.787077] I [rpc-clnt.c:984:rpc_clnt_
connection_init] 0-management: setting frame-timeout to 600
[2017-04-03 22:20:24.787090] D [rpc-clnt.c:996:rpc_clnt_
connection_init] 0-management: setting ping-timeout to 30
[2017-04-03 22:20:24.787106] D [rpc-transport.c:288:rpc_
transport_load] 0-rpc-transport: attempt to load file
/usr/lib/x86_64-linux-gnu/
glusterfs/3.7.6/rpc-transport/ socket.so [2017-04-03 22:20:24.787147] T [MSGID: 0]
[options.c:91:xlator_option_
validate_int] 0-management: no range check required for 'option transport.tcp-user-timeout -1'
[2017-04-03 22:20:24.787166] T [MSGID: 0]
[options.c:91:xlator_option_
validate_int] 0-management: no range check required for 'option transport.socket.keepalive-
time 10' [2017-04-03 22:20:24.787182] T [MSGID: 0]
[options.c:91:xlator_option_
validate_int] 0-management: no range check required for 'option transport.socket.keepalive-
interval 2' [2017-04-03 22:20:24.787203] T [MSGID: 0]
[options.c:91:xlator_option_
validate_int] 0-management: no range check required for 'option remote-port 24007'
[2017-04-03 22:20:24.787221] D [socket.c:3845:socket_init] 0-management:
Configued transport.tcp-user-timeout=-1
[2017-04-03 22:20:24.787233] D [socket.c:3928:socket_init] 0-management:
SSL support on the I/O path is NOT enabled
[2017-04-03 22:20:24.787242] D [socket.c:3931:socket_init] 0-management:
SSL support for glusterd is NOT enabled
[2017-04-03 22:20:24.787251] D [socket.c:3948:socket_init] 0-management:
using system polling thread
[2017-04-03 22:20:24.787263] T [rpc-clnt.c:418:rpc_clnt_
reconnect] 0-management: attempting reconnect
[2017-04-03 22:20:24.787273] T [socket.c:2887:socket_connect] 0-management:
connecting 0x7ff7fc005640, state=0 gen=0 sock=-1
[2017-04-03 22:20:24.787288] T [MSGID: 0]
[common-utils.c:290:gf_
resolve_ip6] 0-resolver: DNS cache not present, freshly probing hostname: 10.1.0.7
[2017-04-03 22:20:24.787843] D [MSGID: 0]
[common-utils.c:333:gf_
resolve_ip6] 0-resolver: returning ip-10.1.0.7 (port-24007) for hostname: 10.1.0.7 and port: 24007
[2017-04-03 22:20:24.787863] D [socket.c:2833:socket_fix_ssl_
opts] 0-management: disabling SSL for portmapper connection
[2017-04-03 22:20:24.787883] T [socket.c:805:__socket_
nodelay] 0-management: NODELAY enabled for socket 8
[2017-04-03 22:20:24.787899] W [socket.c:869:__socket_
keepalive] 0-socket: failed to set TCP_USER_TIMEOUT -1000 on socket 8, Invalid argument
[2017-04-03 22:20:24.787910] E [socket.c:2965:socket_connect] 0-management:
Failed to set keep-alive: Invalid argument
[2017-04-03 22:20:24.788092] D [MSGID: 0]
[common-utils.c:2887:gf_ports_
reserved] 0-glusterfs: lower: 30000, higher: 32767
[2017-04-03 22:21:52.911012] D [logging.c:1828:gf_log_flush_
timeout_cbk] 0-logging-infra: Log timer timed out. About to flush outstanding messages
if present
[2017-04-03 22:21:52.911131] D [logging.c:1790:__gf_log_
inject_timer_event] 0-logging-infra: Starting timer now. Timeout = 120, current buf size = 5
[2017-04-03 22:22:24.707002] D [socket.c:558:__socket_rwv]
0-socket.management: EOF on socket
[2017-04-03 22:22:24.707032] D [socket.c:2355:socket_event_
handler] 0-transport: disconnecting now
I've debugged this to the best of my ability but haven't made any progress
narrowing it down, nor has google turned up anything useful.
Thanks for the help!
Ken
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
_______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://lists.gluster.org/mailman/listinfo/gluster-users