Hence glusterd was crashing with SIGSEGV.
Gaurav
Hi,
I tried this on my host with the corresponding ports set, but I could not reproduce the issue locally. However, the logs you sent make it pretty clear that the issue is related to the reserved ports. I will try to reproduce it on another machine and will update you as soon as possible.
Thanks
On Sun, Jun 18, 2017 at 12:37 PM, Guy Cukierman <guyc@xxxxxxxxxxx> wrote:
Hi,
Below please find the reserved ports and log, thanks.
sysctl net.ipv4.ip_local_reserved_ports:
net.ipv4.ip_local_reserved_ports = 30000-32767
glusterd.log:
[2017-06-18 07:04:17.853162] I [MSGID: 106487] [glusterd-handler.c:1242:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req 192.168.1.17 24007
[2017-06-18 07:04:17.853237] D [MSGID: 0] [common-utils.c:3361:gf_is_local_addr] 0-management: 192.168.1.17
[2017-06-18 07:04:17.854093] D [logging.c:1952:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk
The message "D [MSGID: 0] [common-utils.c:3361:gf_is_local_addr] 0-management: 192.168.1.17 " repeated 2 times between [2017-06-18 07:04:17.853237] and [2017-06-18 07:04:17.853869]
[2017-06-18 07:04:17.854093] D [MSGID: 0] [common-utils.c:3377:gf_is_local_addr] 0-management: 192.168.1.17 is not local
[2017-06-18 07:04:17.854221] D [MSGID: 0] [glusterd-peer-utils.c:132:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: 192.168.1.17
[2017-06-18 07:04:17.854271] D [logging.c:1952:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk
[2017-06-18 07:04:17.854269] D [MSGID: 0] [glusterd-peer-utils.c:132:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: 192.168.1.17
[2017-06-18 07:04:17.854271] D [MSGID: 0] [glusterd-peer-utils.c:246:glusterd_peerinfo_find] 0-management: Unable to find hostname: 192.168.1.17
[2017-06-18 07:04:17.854306] I [MSGID: 106129] [glusterd-handler.c:3690:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 192.168.1.17 (24007)
[2017-06-18 07:04:17.854343] D [MSGID: 0] [glusterd-peer-utils.c:486:glusterd_peer_hostname_new] 0-glusterd: Returning 0
[2017-06-18 07:04:17.854367] D [MSGID: 0] [glusterd-utils.c:7060:glusterd_sm_tr_log_init] 0-glusterd: returning 0
[2017-06-18 07:04:17.854387] D [MSGID: 0] [glusterd-store.c:4092:glusterd_store_create_peer_dir] 0-glusterd: Returning with 0
[2017-06-18 07:04:17.854918] D [MSGID: 0] [store.c:420:gf_store_handle_new] 0-: Returning 0
[2017-06-18 07:04:17.855083] D [MSGID: 0] [store.c:374:gf_store_save_value] 0-management: returning: 0
[2017-06-18 07:04:17.855130] D [logging.c:1952:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk
The message "D [MSGID: 0] [store.c:374:gf_store_save_value] 0-management: returning: 0" repeated 2 times between [2017-06-18 07:04:17.855083] and [2017-06-18 07:04:17.855128]
[2017-06-18 07:04:17.855129] D [MSGID: 0] [glusterd-store.c:4221:glusterd_store_peer_write] 0-glusterd: Returning with 0
[2017-06-18 07:04:17.856294] D [MSGID: 0] [glusterd-store.c:4247:glusterd_store_perform_peer_store] 0-glusterd: Returning 0
[2017-06-18 07:04:17.856332] D [MSGID: 0] [glusterd-store.c:4268:glusterd_store_peerinfo] 0-glusterd: Returning with 0
[2017-06-18 07:04:17.856365] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2017-06-18 07:04:17.856387] D [MSGID: 0] [glusterd-handler.c:3474:glusterd_transport_inet_options_build] 0-glusterd: Returning 0
[2017-06-18 07:04:17.856409] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-06-18 07:04:17.856421] D [rpc-clnt.c:1071:rpc_clnt_connection_init] 0-management: setting ping-timeout to 30
[2017-06-18 07:04:17.856434] D [rpc-transport.c:279:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so
[2017-06-18 07:04:17.856580] D [socket.c:4082:socket_init] 0-management: Configued transport.tcp-user-timeout=-1
[2017-06-18 07:04:17.856594] D [socket.c:4165:socket_init] 0-management: SSL support on the I/O path is NOT enabled
[2017-06-18 07:04:17.856625] D [socket.c:4168:socket_init] 0-management: SSL support for glusterd is NOT enabled
[2017-06-18 07:04:17.856634] D [socket.c:4185:socket_init] 0-management: using system polling thread
[2017-06-18 07:04:17.856664] D [name.c:168:client_fill_address_family] 0-management: address-family not specified, marking it as unspec for getaddrinfo to resolve from (remote-host: 192.168.1.17)
[2017-06-18 07:04:17.861800] D [MSGID: 0] [common-utils.c:334:gf_resolve_ip6] 0-resolver: returning ip-192.168.1.17 (port-24007) for hostname: 192.168.1.17 and port: 24007
[2017-06-18 07:04:17.861830] D [socket.c:2982:socket_fix_ssl_opts] 0-management: disabling SSL for portmapper connection
[2017-06-18 07:04:17.861885] D [MSGID: 0] [common-utils.c:3106:gf_ports_reserved] 0-glusterfs: lower: 30000, higher: 32767
[2017-06-18 07:04:17.861920] D [logging.c:1764:gf_log_flush_extra_msgs] 0-logging-infra: Log buffer size reduced. About to flush 5 extra log messages
[2017-06-18 07:04:17.861933] D [logging.c:1767:gf_log_flush_extra_msgs] 0-logging-infra: Just flushed 5 extra log messages
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2017-06-18 07:04:17
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.10.3
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7fbdf7c964d0]
/lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7fbdf7c9fdd4]
/lib64/libc.so.6(+0x35250)[0x7fbdf637a250]
/lib64/libglusterfs.so.0(gf_ports_reserved+0x15c)[0x7fbdf7ca044c]
/lib64/libglusterfs.so.0(gf_process_reserved_ports+0xbe)[0x7fbdf7ca070e]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xd158)[0x7fbde9c24158]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(client_bind+0x93)[0x7fbde9c245a3]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xa875)[0x7fbde9c21875]
/lib64/libgfrpc.so.0(rpc_clnt_reconnect+0xc9)[0x7fbdf7a5ff89]
/lib64/libgfrpc.so.0(rpc_clnt_start+0x39)[0x7fbdf7a60049]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24218)[0x7fbdec7b5218]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24843)[0x7fbdec7b5843]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24ae0)[0x7fbdec7b5ae0]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27890)[0x7fbdec7b8890]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27e20)[0x7fbdec7b8e20]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x20f5e)[0x7fbdec7b1f5e]
/lib64/libglusterfs.so.0(synctask_wrap+0x10)[0x7fbdf7ccd750]
/lib64/libc.so.6(+0x46cf0)[0x7fbdf638bcf0]
---------
From: Gaurav Yadav [mailto:gyadav@xxxxxxxxxx]
Sent: Friday, June 16, 2017 5:47 AM
To: Atin Mukherjee <amukherj@xxxxxxxxxx>
Cc: Guy Cukierman <guyc@xxxxxxxxxxx>; gluster-users@xxxxxxxxxxx
Subject: Re: gluster peer probe failing
Could you please send me the output of the command "sysctl net.ipv4.ip_local_reserved_ports"? In addition to that output, please send the logs so that I can look into the issue.
Thanks
Gaurav
On Thu, Jun 15, 2017 at 4:28 PM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:
+Gaurav, who is the author of the patch. Gaurav, can you please comment here?
On Thu, Jun 15, 2017 at 3:28 PM, Guy Cukierman <guyc@xxxxxxxxxxx> wrote:
Thanks, but my current settings are:
net.ipv4.ip_local_reserved_ports = 30000-32767
net.ipv4.ip_local_port_range = 32768 60999
meaning the reserved ports are already within the short int range. Did I misunderstand something, or is this a different issue?
From: Atin Mukherjee [mailto:amukherj@xxxxxxxxxx]
Sent: Thursday, June 15, 2017 10:56 AM
To: Guy Cukierman <guyc@xxxxxxxxxxx>
Cc: gluster-users@xxxxxxxxxxx
Subject: Re: gluster peer probe failing
https://review.gluster.org/#/c/17494/ will fix it, and the next update of 3.10 should have this fix. If sysctl net.ipv4.ip_local_reserved_ports contains any value beyond the short int range, the current version will hit this problem.
Would you be able to reset the reserved ports temporarily to get this going?
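To make the condition above concrete, here is a minimal Python sketch of the check. It is only an illustration and not GlusterFS code: the helper names are hypothetical, and it assumes that "short int range" refers to a signed 16-bit integer (maximum 32767) and that the sysctl value is a comma-separated list of ports and port ranges.

```python
# Illustration only; the function names are hypothetical, not GlusterFS APIs.
# Assumption: "short int range" means a signed 16-bit integer (max 32767).
SHORT_INT_MAX = 32767


def parse_reserved_ports(value):
    """Parse a sysctl-style list such as "30000-32767,40000" into (low, high) pairs."""
    ranges = []
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        low, sep, high = part.partition("-")
        # A single port "40000" is treated as the range (40000, 40000).
        ranges.append((int(low), int(high) if sep else int(low)))
    return ranges


def ports_above_short_range(value):
    """Return the ranges whose upper bound exceeds SHORT_INT_MAX."""
    return [(lo, hi) for lo, hi in parse_reserved_ports(value) if hi > SHORT_INT_MAX]


print(ports_above_short_range("30000-32767"))              # []
print(ports_above_short_range("30000-32767,49152-49200"))  # [(49152, 49200)]
```

Under these assumptions, a setting of 30000-32767 would not be flagged, because its upper bound sits exactly at the assumed limit.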
On Wed, Jun 14, 2017 at 8:32 PM, Guy Cukierman <guyc@xxxxxxxxxxx> wrote:
Hi,
I have a gluster (version 3.10.2) server running on a 3 node (centos7) cluster.
Firewalld and SELinux are disabled, and I see I can telnet from each node to the other on port 24007.
When I try to create the first peering by running on node1 the command:
gluster peer probe <node2 ip address>
I get the error:
“Connection failed. Please check if gluster daemon is operational.”
And Glusterd.log shows:
[2017-06-14 14:46:09.927510] I [MSGID: 106487] [glusterd-handler.c:1242:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req 192.168.1.17 24007
[2017-06-14 14:46:09.928560] I [MSGID: 106129] [glusterd-handler.c:3690:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 192.168.1.17 (24007)
[2017-06-14 14:46:09.930783] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2017-06-14 14:46:09.930837] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2017-06-14 14:46:09
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.10.3
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f69625da4d0]
/lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7f69625e3dd4]
/lib64/libc.so.6(+0x35250)[0x7f6960cbe250]
/lib64/libglusterfs.so.0(gf_ports_reserved+0x15c)[0x7f69625e444c]
/lib64/libglusterfs.so.0(gf_process_reserved_ports+0xbe)[0x7f69625e470e]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xd158)[0x7f6954568158]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(client_bind+0x93)[0x7f69545685a3]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xa875)[0x7f6954565875]
/lib64/libgfrpc.so.0(rpc_clnt_reconnect+0xc9)[0x7f69623a3f89]
/lib64/libgfrpc.so.0(rpc_clnt_start+0x39)[0x7f69623a4049]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24218)[0x7f69570f9218]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24843)[0x7f69570f9843]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24ae0)[0x7f69570f9ae0]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27890)[0x7f69570fc890]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27e20)[0x7f69570fce20]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x20f5e)[0x7f69570f5f5e]
/lib64/libglusterfs.so.0(synctask_wrap+0x10)[0x7f6962611750]
/lib64/libc.so.6(+0x46cf0)[0x7f6960ccfcf0]
And a file is created under /var/lib/glusterd/peers/<node2 ip address> which contains:
uuid=00000000-0000-0000-0000-000000000000
state=0
hostname1=192.168.1.17
hostname1=192.168.1.17
and the glusterd daemon exits and I cannot restart it until I delete this file from the peers folder.
Any idea what is wrong?
thanks!
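The leftover peer file described above can be located with a small script before restarting glusterd. The following Python sketch is a hypothetical helper and not part of GlusterFS: it assumes the peer files are plain key=value text as shown in this message, and it only lists files whose uuid is all zeros without deleting anything.

```python
import os
import tempfile

# Assumption: an all-zero uuid marks the leftover file from the failed probe,
# as in the file contents quoted in this thread.
ZERO_UUID = "00000000-0000-0000-0000-000000000000"


def read_peer_file(path):
    """Read a key=value peer file into a dict; lines without '=' are skipped."""
    entries = {}
    with open(path) as handle:
        for line in handle:
            key, sep, val = line.strip().partition("=")
            if sep:
                entries[key] = val
    return entries


def find_stale_peer_files(peers_dir):
    """Return the sorted paths of peer files whose uuid is all zeros."""
    stale = []
    for name in sorted(os.listdir(peers_dir)):
        path = os.path.join(peers_dir, name)
        if os.path.isfile(path) and read_peer_file(path).get("uuid") == ZERO_UUID:
            stale.append(path)
    return stale


if __name__ == "__main__":
    # Demo against a temporary directory rather than the real peers folder.
    demo_dir = tempfile.mkdtemp()
    with open(os.path.join(demo_dir, "192.168.1.17"), "w") as handle:
        handle.write("uuid=%s\nstate=0\nhostname1=192.168.1.17\n" % ZERO_UUID)
    print(find_stale_peer_files(demo_dir))
```

On a real node the call would be find_stale_peer_files("/var/lib/glusterd/peers"); whether it is safe to remove a listed file is a judgment for the administrator.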
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users