Any recommended workaround? - You probably need to clear the IP reserved local ports setting (net.ipv4.ip_local_reserved_ports).
Thanks Gaurav!
- Any estimate of when this fix will be released?
- Any recommended workaround?
Best,
Guy.
From: Gaurav Yadav [mailto:gyadav@xxxxxxxxxx]
Sent: Tuesday, June 20, 2017 9:46 AM
To: Guy Cukierman <guyc@xxxxxxxxxxx>
Cc: Atin Mukherjee <amukherj@xxxxxxxxxx>; gluster-users@xxxxxxxxxxx
Subject: Re: gluster peer probe failing
Hi,
I was able to reproduce the issue, and here is my RCA.
The maximum value, i.e. 32767, overflows while it is being manipulated, and this case was previously not handled properly.
Hence glusterd was crashing with SIGSEGV. The issue is being fixed in https://bugzilla.redhat.com/show_bug.cgi?id=1454418 and is being backported as well.
Thanks
Gaurav
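To illustrate the kind of overflow described in the RCA above, here is a minimal sketch in Python. It is not the glusterd code: it only assumes that the port counter is stored in a signed 16-bit integer, and the helper names int16 and walk_range are hypothetical. It shows that a counter walking a range whose upper bound is 32767 wraps to a negative value instead of leaving the loop.

```python
import ctypes


def int16(value):
    """Truncate value to a signed 16-bit integer, as a C int16_t store would."""
    return ctypes.c_int16(value).value


def walk_range(lower, higher, max_steps):
    """Visit lower..higher with a 16-bit counter, giving up after max_steps.

    Assumption: this mimics a C loop of the form
    `for (p = lower; p <= higher; p++)` where p is a signed 16-bit value.
    """
    visited = []
    port = int16(lower)
    while port <= higher and len(visited) < max_steps:
        visited.append(port)
        port = int16(port + 1)  # wraps to -32768 after 32767
    return visited


# An upper bound below 32767 terminates normally.
print(walk_range(32765, 32766, 10))  # [32765, 32766]

# With an upper bound of 32767 the counter wraps and the loop keeps going
# with negative values (cut off here by max_steps).
print(walk_range(32765, 32767, 5))   # [32765, 32766, 32767, -32768, -32767]
```

If such a negative value were used as an array index, an out-of-bounds access and a SIGSEGV would be a plausible result, which matches the crash reported in this thread.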
On Tue, Jun 20, 2017 at 6:43 AM, Gaurav Yadav <gyadav@xxxxxxxxxx> wrote:
Hi,
I tried setting the corresponding ports on my host, but I didn't see the issue locally.
However, from the logs you sent it is pretty clear that the issue is related to ports only.
I will try to reproduce it on another machine and will update you as soon as possible.
Thanks
Gaurav
On Sun, Jun 18, 2017 at 12:37 PM, Guy Cukierman <guyc@xxxxxxxxxxx> wrote:
Hi,
Below please find the reserved ports and log, thanks.
sysctl net.ipv4.ip_local_reserved_ports:
net.ipv4.ip_local_reserved_ports = 30000-32767
glusterd.log:
[2017-06-18 07:04:17.853162] I [MSGID: 106487] [glusterd-handler.c:1242:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req 192.168.1.17 24007
[2017-06-18 07:04:17.853237] D [MSGID: 0] [common-utils.c:3361:gf_is_local_addr] 0-management: 192.168.1.17
[2017-06-18 07:04:17.854093] D [logging.c:1952:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk
The message "D [MSGID: 0] [common-utils.c:3361:gf_is_local_addr] 0-management: 192.168.1.17 " repeated 2 times between [2017-06-18 07:04:17.853237] and [2017-06-18 07:04:17.853869]
[2017-06-18 07:04:17.854093] D [MSGID: 0] [common-utils.c:3377:gf_is_local_addr] 0-management: 192.168.1.17 is not local
[2017-06-18 07:04:17.854221] D [MSGID: 0] [glusterd-peer-utils.c:132:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: 192.168.1.17
[2017-06-18 07:04:17.854271] D [logging.c:1952:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk
[2017-06-18 07:04:17.854269] D [MSGID: 0] [glusterd-peer-utils.c:132:glusterd_peerinfo_find_by_hostname] 0-management: Unable to find friend: 192.168.1.17
[2017-06-18 07:04:17.854271] D [MSGID: 0] [glusterd-peer-utils.c:246:glusterd_peerinfo_find] 0-management: Unable to find hostname: 192.168.1.17
[2017-06-18 07:04:17.854306] I [MSGID: 106129] [glusterd-handler.c:3690:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 192.168.1.17 (24007)
[2017-06-18 07:04:17.854343] D [MSGID: 0] [glusterd-peer-utils.c:486:glusterd_peer_hostname_new] 0-glusterd: Returning 0
[2017-06-18 07:04:17.854367] D [MSGID: 0] [glusterd-utils.c:7060:glusterd_sm_tr_log_init] 0-glusterd: returning 0
[2017-06-18 07:04:17.854387] D [MSGID: 0] [glusterd-store.c:4092:glusterd_store_create_peer_dir] 0-glusterd: Returning with 0
[2017-06-18 07:04:17.854918] D [MSGID: 0] [store.c:420:gf_store_handle_new] 0-: Returning 0
[2017-06-18 07:04:17.855083] D [MSGID: 0] [store.c:374:gf_store_save_value] 0-management: returning: 0
[2017-06-18 07:04:17.855130] D [logging.c:1952:_gf_msg_internal] 0-logging-infra: Buffer overflow of a buffer whose size limit is 5. About to flush least recently used log message to disk
The message "D [MSGID: 0] [store.c:374:gf_store_save_value] 0-management: returning: 0" repeated 2 times between [2017-06-18 07:04:17.855083] and [2017-06-18 07:04:17.855128]
[2017-06-18 07:04:17.855129] D [MSGID: 0] [glusterd-store.c:4221:glusterd_store_peer_write] 0-glusterd: Returning with 0
[2017-06-18 07:04:17.856294] D [MSGID: 0] [glusterd-store.c:4247:glusterd_store_perform_peer_store] 0-glusterd: Returning 0
[2017-06-18 07:04:17.856332] D [MSGID: 0] [glusterd-store.c:4268:glusterd_store_peerinfo] 0-glusterd: Returning with 0
[2017-06-18 07:04:17.856365] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2017-06-18 07:04:17.856387] D [MSGID: 0] [glusterd-handler.c:3474:glusterd_transport_inet_options_build] 0-glusterd: Returning 0
[2017-06-18 07:04:17.856409] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
[2017-06-18 07:04:17.856421] D [rpc-clnt.c:1071:rpc_clnt_connection_init] 0-management: setting ping-timeout to 30
[2017-06-18 07:04:17.856434] D [rpc-transport.c:279:rpc_transport_load] 0-rpc-transport: attempt to load file /usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so
[2017-06-18 07:04:17.856580] D [socket.c:4082:socket_init] 0-management: Configued transport.tcp-user-timeout=-1
[2017-06-18 07:04:17.856594] D [socket.c:4165:socket_init] 0-management: SSL support on the I/O path is NOT enabled
[2017-06-18 07:04:17.856625] D [socket.c:4168:socket_init] 0-management: SSL support for glusterd is NOT enabled
[2017-06-18 07:04:17.856634] D [socket.c:4185:socket_init] 0-management: using system polling thread
[2017-06-18 07:04:17.856664] D [name.c:168:client_fill_address_family] 0-management: address-family not specified, marking it as unspec for getaddrinfo to resolve from (remote-host: 192.168.1.17)
[2017-06-18 07:04:17.861800] D [MSGID: 0] [common-utils.c:334:gf_resolve_ip6] 0-resolver: returning ip-192.168.1.17 (port-24007) for hostname: 192.168.1.17 and port: 24007
[2017-06-18 07:04:17.861830] D [socket.c:2982:socket_fix_ssl_opts] 0-management: disabling SSL for portmapper connection
[2017-06-18 07:04:17.861885] D [MSGID: 0] [common-utils.c:3106:gf_ports_reserved] 0-glusterfs: lower: 30000, higher: 32767
[2017-06-18 07:04:17.861920] D [logging.c:1764:gf_log_flush_extra_msgs] 0-logging-infra: Log buffer size reduced. About to flush 5 extra log messages
[2017-06-18 07:04:17.861933] D [logging.c:1767:gf_log_flush_extra_msgs] 0-logging-infra: Just flushed 5 extra log messages
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2017-06-18 07:04:17
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.10.3
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7fbdf7c964d0]
/lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7fbdf7c9fdd4]
/lib64/libc.so.6(+0x35250)[0x7fbdf637a250]
/lib64/libglusterfs.so.0(gf_ports_reserved+0x15c)[0x7fbdf7ca044c]
/lib64/libglusterfs.so.0(gf_process_reserved_ports+0xbe)[0x7fbdf7ca070e]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xd158)[0x7fbde9c24158]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(client_bind+0x93)[0x7fbde9c245a3]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xa875)[0x7fbde9c21875]
/lib64/libgfrpc.so.0(rpc_clnt_reconnect+0xc9)[0x7fbdf7a5ff89]
/lib64/libgfrpc.so.0(rpc_clnt_start+0x39)[0x7fbdf7a60049]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24218)[0x7fbdec7b5218]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24843)[0x7fbdec7b5843]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24ae0)[0x7fbdec7b5ae0]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27890)[0x7fbdec7b8890]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27e20)[0x7fbdec7b8e20]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x20f5e)[0x7fbdec7b1f5e]
/lib64/libglusterfs.so.0(synctask_wrap+0x10)[0x7fbdf7ccd750]
/lib64/libc.so.6(+0x46cf0)[0x7fbdf638bcf0]
---------
From: Gaurav Yadav [mailto:gyadav@xxxxxxxxxx]
Sent: Friday, June 16, 2017 5:47 AM
To: Atin Mukherjee <amukherj@xxxxxxxxxx>
Cc: Guy Cukierman <guyc@xxxxxxxxxxx>; gluster-users@xxxxxxxxxxx
Subject: Re: gluster peer probe failing
Could you please send me the output of the command "sysctl net.ipv4.ip_local_reserved_ports"?
Along with that output, please also send the logs so that I can look into the issue.
Thanks
Gaurav
On Thu, Jun 15, 2017 at 4:28 PM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:
+Gaurav, who is the author of the patch. Gaurav, can you please comment here?
On Thu, Jun 15, 2017 at 3:28 PM, Guy Cukierman <guyc@xxxxxxxxxxx> wrote:
Thanks, but my current settings are:
net.ipv4.ip_local_reserved_ports = 30000-32767
net.ipv4.ip_local_port_range = 32768 60999
meaning the reserved ports are already within the short int range, so maybe I misunderstood something? Or is it a different issue?
From: Atin Mukherjee [mailto:amukherj@xxxxxxxxxx]
Sent: Thursday, June 15, 2017 10:56 AM
To: Guy Cukierman <guyc@xxxxxxxxxxx>
Cc: gluster-users@xxxxxxxxxxx
Subject: Re: gluster peer probe failing
https://review.gluster.org/#/c/17494/ will fix it, and the next update of 3.10 should have this fix.
If sysctl net.ipv4.ip_local_reserved_ports has any value > short int range then this would be a problem with the current version.
Would you be able to reset the reserved ports temporarily to get this going?
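To check a host against the condition described above, a small sketch like the following could be used. It assumes the sysctl value is a comma-separated list of single ports and low-high ranges (the kernel's documented format for ip_local_reserved_ports); the function names and the default limit of 32767 (the signed short maximum) are illustrative choices, not anything taken from glusterd.

```python
def parse_reserved_ports(value):
    """Parse a value such as '30000-32767,40000' into (low, high) pairs."""
    ranges = []
    for item in value.split(","):
        item = item.strip()
        if not item:
            continue  # an empty setting means no reserved ports
        low, _, high = item.partition("-")
        # A single port such as '40000' is treated as the range 40000-40000.
        ranges.append((int(low), int(high or low)))
    return ranges


def risky_ranges(value, limit=32767):
    """Return the ranges whose upper bound reaches or exceeds limit."""
    return [r for r in parse_reserved_ports(value) if r[1] >= limit]


print(risky_ranges("30000-32767"))  # [(30000, 32767)]
print(risky_ranges("30000-32766"))  # []
```

The comparison uses >= rather than > because the setting reported in this thread, 30000-32767, ends exactly at the short int maximum and still triggered the crash.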
On Wed, Jun 14, 2017 at 8:32 PM, Guy Cukierman <guyc@xxxxxxxxxxx> wrote:
Hi,
I have a gluster (version 3.10.2) server running on a 3-node (CentOS 7) cluster.
Firewalld and SELinux are disabled, and I can telnet from each node to the others on port 24007.
When I try to create the first peer by running this command on node1:
gluster peer probe <node2 ip address>
I get the error:
“Connection failed. Please check if gluster daemon is operational.”
And Glusterd.log shows:
[2017-06-14 14:46:09.927510] I [MSGID: 106487] [glusterd-handler.c:1242:__glusterd_handle_cli_probe] 0-glusterd: Received CLI probe req 192.168.1.17 24007
[2017-06-14 14:46:09.928560] I [MSGID: 106129] [glusterd-handler.c:3690:glusterd_probe_begin] 0-glusterd: Unable to find peerinfo for host: 192.168.1.17 (24007)
[2017-06-14 14:46:09.930783] W [MSGID: 106062] [glusterd-handler.c:3466:glusterd_transport_inet_options_build] 0-glusterd: Failed to get tcp-user-timeout
[2017-06-14 14:46:09.930837] I [rpc-clnt.c:1059:rpc_clnt_connection_init] 0-management: setting frame-timeout to 600
pending frames:
frame : type(0) op(0)
patchset: git://git.gluster.org/glusterfs.git
signal received: 11
time of crash:
2017-06-14 14:46:09
configuration details:
argp 1
backtrace 1
dlfcn 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.10.3
/lib64/libglusterfs.so.0(_gf_msg_backtrace_nomem+0xa0)[0x7f69625da4d0]
/lib64/libglusterfs.so.0(gf_print_trace+0x324)[0x7f69625e3dd4]
/lib64/libc.so.6(+0x35250)[0x7f6960cbe250]
/lib64/libglusterfs.so.0(gf_ports_reserved+0x15c)[0x7f69625e444c]
/lib64/libglusterfs.so.0(gf_process_reserved_ports+0xbe)[0x7f69625e470e]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xd158)[0x7f6954568158]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(client_bind+0x93)[0x7f69545685a3]
/usr/lib64/glusterfs/3.10.3/rpc-transport/socket.so(+0xa875)[0x7f6954565875]
/lib64/libgfrpc.so.0(rpc_clnt_reconnect+0xc9)[0x7f69623a3f89]
/lib64/libgfrpc.so.0(rpc_clnt_start+0x39)[0x7f69623a4049]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24218)[0x7f69570f9218]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24843)[0x7f69570f9843]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x24ae0)[0x7f69570f9ae0]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27890)[0x7f69570fc890]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x27e20)[0x7f69570fce20]
/usr/lib64/glusterfs/3.10.3/xlator/mgmt/glusterd.so(+0x20f5e)[0x7f69570f5f5e]
/lib64/libglusterfs.so.0(synctask_wrap+0x10)[0x7f6962611750]
/lib64/libc.so.6(+0x46cf0)[0x7f6960ccfcf0]
And a file is created under /var/lib/glusterd/peers/<node2 ip address> which contains:
uuid=00000000-0000-0000-0000-000000000000
state=0
hostname1=192.168.1.17
and the glusterd daemon exits and I cannot restart it until I delete this file from the peers folder.
Any idea what is wrong?
thanks!
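For anyone who wants to spot such a leftover peer file before restarting glusterd, here is a rough sketch. It only assumes the key=value layout shown above; the helper names and the idea of treating an all-zero uuid as a leftover from the failed probe are illustrative assumptions based on this report, and the sketch deliberately does not delete anything.

```python
NULL_UUID = "00000000-0000-0000-0000-000000000000"


def parse_peer_file(text):
    """Parse key=value lines from a peer file into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that have no '=' at all
            fields[key.strip()] = value.strip()
    return fields


def has_null_uuid(text):
    """Return True if the peer file content carries the all-zero uuid."""
    return parse_peer_file(text).get("uuid") == NULL_UUID


# Content as reported in this thread.
sample = "uuid=00000000-0000-0000-0000-000000000000\nstate=0\nhostname1=192.168.1.17\n"
print(has_null_uuid(sample))  # True
```

In practice the text would be read from each file under /var/lib/glusterd/peers/, and any match reviewed by hand before removing it.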
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users