Re: rpc_client_ping_timer_expired logic

On 02/04/2016 08:26 PM, Khoi Mai wrote:
Hi Gluster community,

Can someone who has insight into how rpc_client_ping_timer_expired operates explain it? I would love to learn more about it. The reason is that last week two FUSE clients logged the same disconnect message but reconnected immediately afterwards. I'd like to know what may have caused this behavior and where else I can look to build an understanding of the root cause. The gluster node does show the same disconnect/reconnect.

The way it works: when the first message is sent to the server, a ping RPC is also sent and a timer is started, 42 seconds by default (it can be changed with network.ping-timeout).
          If the ping response arrives, the earlier timer is stopped and a new 42-second timer is started for the next ping message.
          If the ping response doesn't arrive within 42 seconds, the timer expires. At that point, if there has been no transport activity (no other messages sent or received), the transport is disconnected and a reconnect is attempted. Otherwise it assumes the ping response may still come, so it extends the timer by another 42 seconds to see whether the response arrives.
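
The decision described above can be sketched as a small state model. This is only an illustration of the logic as explained in this message, not the GlusterFS source: the class name PingTimer, its methods, and the explicit `now` timestamps are all hypothetical, and restarting the timer directly on a ping response is a simplification.

```python
from dataclasses import dataclass


@dataclass
class PingTimer:
    """Toy model of the client ping timer (hypothetical, not GlusterFS code)."""
    timeout: float = 42.0                  # stands in for network.ping-timeout (seconds)
    deadline: float = 0.0                  # time at which the current timer expires
    last_activity: float = float("-inf")   # last time any other message was sent/received
    connected: bool = True

    def send_ping(self, now: float) -> None:
        # A ping goes out and a fresh timer is started.
        self.deadline = now + self.timeout

    def on_activity(self, now: float) -> None:
        # Some other message was sent or received on the transport.
        self.last_activity = now

    def on_ping_response(self, now: float) -> None:
        # Response arrived: drop the old timer and start one for the next ping.
        self.send_ping(now)

    def on_timer_expired(self, now: float) -> str:
        # Called when the deadline passes without a ping response.
        if now - self.last_activity < self.timeout:
            # The transport was active during the window, so wait one more timeout.
            self.deadline = now + self.timeout
            return "extended"
        # No activity at all during the window: disconnect (a reconnect would follow).
        self.connected = False
        return "disconnected"
```

With a 10-second timeout, a ping sent at t=0 and some other traffic at t=5, the expiry at t=10 returns "extended"; if nothing further happens, the expiry at t=20 returns "disconnected".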

Pranith

Jan 28 14:25:27 omhq1cab GlusterFS[1640]: [2016-01-28 20:25:27.685703] C [client-handshake.c:127:rpc_client_ping_timer_expired] 0-prodstatic-client-3: server 72.36.4.204:49155 has not responded in the last 10 seconds, disconnecting.


Jan 28 14:24:52 omhq1ca9 GlusterFS[1612]: [2016-01-28 20:24:52.589450] C [client-handshake.c:127:rpc_client_ping_timer_expired] 0-prodstatic-client-3: server 72.36.4.204:49155 has not responded in the last 10 seconds, disconnecting.

My setup for the volume is as follows. Brick4 was the one that appeared unresponsive to the clients. I have an environment where multiple clients (30+) mount this volume, and none of the other clients logged any issues with Brick4.

Volume Name: prodstatic
Type: Distributed-Replicate
Volume ID: 187c241d-0eeb-4405-80f2-c704ea44bc36
Status: Started
Number of Bricks: 2 x 4 = 8
Transport-type: tcp
Bricks:
Brick1: server1140:/export/content/static
Brick2: server1c5d:/export/content/static
Brick3: server11ad:/export/content/static
Brick4: server1781:/export/content/static
Brick5: server1c56:/export/content/static
Brick6: server1c58:/export/content/static
Brick7: server1c57:/export/content/static
Brick8: server1c59:/export/content/static
Options Reconfigured:
network.ping-timeout: 10
server.allow-insecure: on
features.quota: on

Thanks
Khoi




_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users
