On Sat, 2 Dec 2017 at 19:29, Jo Goossens <jo.goossens@xxxxxxxxxxxxxxxx> wrote:
Hello Atin,
Could you confirm this should have been fixed in 3.10.8? If so we'll test it for sure!
Fix should be part of 3.10.8 which is awaiting release announcement.
Regards
Jo
-----Original message-----
From: Atin Mukherjee <amukherj@xxxxxxxxxx>
Sent: Mon 30-10-2017 17:40
Subject: Re: BUG: After stop and start wrong port is advertised
To: Jo Goossens <jo.goossens@xxxxxxxxxxxxxxxx>;
CC: gluster-users@xxxxxxxxxxx;
On Sat, 28 Oct 2017 at 02:36, Jo Goossens <jo.goossens@xxxxxxxxxxxxxxxx> wrote:
Hello Atin,
I just read it and am very happy you found the issue. We really hope this will be fixed in the next version, 3.10.7!
Not in 3.10.7, I guess, as the patch is still in review and 3.10.7 is getting tagged today. You'll get this fix in 3.10.8.
PS: Wow, nice, all that C code and those "goto out" statements (not always considered clean, but often the best way, I think). I can remember the days when I wrote kernel drivers in C myself :)
Regards
Jo Goossens
-----Original message-----
From: Atin Mukherjee <amukherj@xxxxxxxxxx>
Sent: Fri 27-10-2017 21:01
Subject: Re: BUG: After stop and start wrong port is advertised
To: Jo Goossens <jo.goossens@xxxxxxxxxxxxxxxx>;
CC: gluster-users@xxxxxxxxxxx;
We (finally) figured out the root cause, Jo!
Patch https://review.gluster.org/#/c/18579 posted upstream for review.
On Thu, Sep 21, 2017 at 2:08 PM, Jo Goossens <jo.goossens@xxxxxxxxxxxxxxxx> wrote:
Hi,
We use glusterfs 3.10.5 on Debian 9.
When we stop or restart the service, e.g.: service glusterfs-server restart
We see that the wrong port gets advertised afterwards. For example:
Before restart:
Status of volume: public
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.140.41:/gluster/public        49153     0          Y       6364
Brick 192.168.140.42:/gluster/public        49152     0          Y       1483
Brick 192.168.140.43:/gluster/public        49152     0          Y       5913
Self-heal Daemon on localhost               N/A       N/A        Y       5932
Self-heal Daemon on 192.168.140.42          N/A       N/A        Y       13084
Self-heal Daemon on 192.168.140.41          N/A       N/A        Y       15499

Task Status of Volume public
------------------------------------------------------------------------------
There are no active volume tasks

After restart of the service on one of the nodes (192.168.140.43) the port seems to have changed (but it didn't):

root@app3:/var/log/glusterfs# gluster volume status
Status of volume: public
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick 192.168.140.41:/gluster/public        49153     0          Y       6364
Brick 192.168.140.42:/gluster/public        49152     0          Y       1483
Brick 192.168.140.43:/gluster/public        49154     0          Y       5913
Self-heal Daemon on localhost               N/A       N/A        Y       4628
Self-heal Daemon on 192.168.140.42          N/A       N/A        Y       3077
Self-heal Daemon on 192.168.140.41          N/A       N/A        Y       28777

Task Status of Volume public
------------------------------------------------------------------------------
There are no active volume tasks

However, the active process is STILL the same pid AND still listening on the old port:

root@192.168.140.43:/var/log/glusterfs# netstat -tapn | grep gluster
tcp        0      0 0.0.0.0:49152           0.0.0.0:*               LISTEN      5913/glusterfsd

The other nodes' logs fill up with errors because they can't reach the daemon anymore. They try to reach it on the "new" port instead of the old one:

[2017-09-21 08:33:25.225006] E [socket.c:2327:socket_connect_finish] 0-public-client-2: connection to 192.168.140.43:49154 failed (Connection refused); disconnecting socket
[2017-09-21 08:33:29.226633] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 0-public-client-2: changing port to 49154 (from 0)
[2017-09-21 08:33:29.227490] E [socket.c:2327:socket_connect_finish] 0-public-client-2: connection to 192.168.140.43:49154 failed (Connection refused); disconnecting socket
[2017-09-21 08:33:33.225849] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 0-public-client-2: changing port to 49154 (from 0)
[2017-09-21 08:33:33.236395] E [socket.c:2327:socket_connect_finish] 0-public-client-2: connection to 192.168.140.43:49154 failed (Connection refused); disconnecting socket
[2017-09-21 08:33:37.225095] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 0-public-client-2: changing port to 49154 (from 0)
[2017-09-21 08:33:37.225628] E [socket.c:2327:socket_connect_finish] 0-public-client-2: connection to 192.168.140.43:49154 failed (Connection refused); disconnecting socket
[2017-09-21 08:33:41.225805] I [rpc-clnt.c:2000:rpc_clnt_reconfig] 0-public-client-2: changing port to 49154 (from 0)
[2017-09-21 08:33:41.226440] E [socket.c:2327:socket_connect_finish] 0-public-client-2: connection to 192.168.140.43:49154 failed (Connection refused); disconnecting socket

So they now try 49154 instead of the old 49152.

Is this also by design? We had a lot of issues because of this recently. We don't understand why it starts advertising a completely wrong port after stop/start.
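For anyone who wants to check a node for the same symptom, here is a minimal sketch that compares the port advertised by `gluster volume status` with the ports the brick pid is actually listening on according to `netstat -tapn`. It is not part of gluster and does not fix anything. The function names (`advertised_bricks`, `listening_ports`, `stale_adverts`) are hypothetical, and the parsing assumes the output looks like the samples in this report.

```python
import re

# Brick rows of `gluster volume status`, assumed to look like the report:
# "Brick 192.168.140.43:/gluster/public 49154 0 Y 5913"
BRICK_RE = re.compile(
    r"^Brick\s+(\S+?):(/\S*)\s+(\d+|N/A)\s+(\d+|N/A)\s+([YN])\s+(\d+|N/A)\s*$"
)

# LISTEN rows of `netstat -tapn`, assumed to look like the report:
# "tcp 0 0 0.0.0.0:49152 0.0.0.0:* LISTEN 5913/glusterfsd"
LISTEN_RE = re.compile(
    r"^tcp\S*\s+\d+\s+\d+\s+\S+:(\d+)\s+\S+\s+LISTEN\s+(\d+)/(\S+)"
)


def advertised_bricks(status_text):
    """Return {(host, path): (tcp_port, pid)} for bricks with a numeric port and pid."""
    bricks = {}
    for line in status_text.splitlines():
        m = BRICK_RE.match(line.strip())
        if m and m.group(3) != "N/A" and m.group(6) != "N/A":
            bricks[(m.group(1), m.group(2))] = (int(m.group(3)), int(m.group(6)))
    return bricks


def listening_ports(netstat_text):
    """Return {pid: set of TCP ports} for sockets in LISTEN state."""
    ports = {}
    for line in netstat_text.splitlines():
        m = LISTEN_RE.match(line.strip())
        if m:
            ports.setdefault(int(m.group(2)), set()).add(int(m.group(1)))
    return ports


def stale_adverts(status_text, netstat_text, local_host):
    """List local bricks whose advertised port is not one their pid listens on."""
    listening = listening_ports(netstat_text)
    stale = []
    for (host, path), (port, pid) in advertised_bricks(status_text).items():
        if host != local_host:
            continue  # netstat only covers processes on this node
        actual = listening.get(pid, set())
        if port not in actual:
            stale.append((path, pid, port, sorted(actual)))
    return stale


# Sample input copied from the report (status after the restart, netstat on .43).
STATUS_AFTER = """\
Brick 192.168.140.41:/gluster/public 49153 0 Y 6364
Brick 192.168.140.42:/gluster/public 49152 0 Y 1483
Brick 192.168.140.43:/gluster/public 49154 0 Y 5913
Self-heal Daemon on localhost N/A N/A Y 4628
"""
NETSTAT = "tcp 0 0 0.0.0.0:49152 0.0.0.0:* LISTEN 5913/glusterfsd"

if __name__ == "__main__":
    print(stale_adverts(STATUS_AFTER, NETSTAT, "192.168.140.43"))
```

With the sample data above, the only entry reported is the brick on 192.168.140.43: pid 5913 is advertised on 49154 but listens on 49152. In practice the two text inputs would come from running the commands on the node being checked.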
Regards
Jo Goossens
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users
--
- Atin (atinm)