Clements, thanks for the detailed explanation. I think things are
clear to me now. I will try to apply the changes in sshd_config.
Thanks also to Michael and everyone else for providing insights on
the issue.

Thanks
Zaman

On Fri, Jun 11, 2010 at 5:22 AM, Glynn Clements <glynn@xxxxxxxxxxxxxxxxxx> wrote:
>
> query wrote:
>
>> >> okay..Thanks for the clarification. Since the host sometimes
>> >> continues to remain busy for around 2 hours,
>> >
>> > Busy to the point that ssh/sshd doesn't get *any* CPU time for 2
>> > hours? Either you're misunderstanding something, or that's a seriously
>> > misconfigured server.
>>
>> That is my misunderstanding only. The CPU is 100% busy, but it is not
>> the case that all 100% is being used by our process and no other
>> process is getting CPU time. I will calculate an optimal value by
>> going over the system once more during peak CPU utilization.
>> But I am still confused about who is terminating the connection in
>> our case and how the timeout value is calculated.
>> As you mentioned in your first comment, it is the kernel that is
>> terminating the connection, but based on what is it terminating
>> the connection? As you said earlier, keep-alive allows us to detect
>> that a host is unreachable (e.g. network failure, system crash,
>> power failure, etc.); it is not going to kill sshd,
>
> It won't kill sshd; however, if packets (data or keep-alives) which
> are sent to the client stop being acknowledged, operations on the
> socket will eventually fail with ETIMEDOUT. At this point, sshd will
> close the connection of its own accord.
>
> The relevant setting is /proc/sys/net/ipv4/tcp_retries2:
>
>   tcp_retries2 (integer; default: 15; since Linux 2.2)
>       The maximum number of times a TCP packet is retransmitted in
>       established state before giving up. The default value is 15,
>       which corresponds to a duration of approximately between 13 to
>       30 minutes, depending on the retransmission timeout.
>       The RFC 1122 specified minimum limit of 100 seconds is typically
>       deemed too short.
>
> The initial retransmission timeout is determined by the measured
> round-trip latency for the connection. Subsequent retransmissions
> occur at exponentially increasing intervals, capped at 120 seconds.
>
> If keep-alives aren't being sent, the connection can only time out as
> a result of data being sent. If keep-alives are being sent, a timeout
> can occur even if the connection is otherwise idle (that's the purpose
> of keep-alives).
>
> --
> Glynn Clements <glynn@xxxxxxxxxxxxxxxxxx>
--
To unsubscribe from this list: send the line "unsubscribe linux-admin" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
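
For anyone who wants to see roughly how tcp_retries2 turns into a
duration, the backoff described in the quoted message (exponentially
increasing intervals, capped at 120 seconds) can be sketched in a few
lines of Python. This is only an estimate under stated assumptions:
the function names are made up for illustration, the initial
retransmission timeout is passed in by hand (the kernel derives it from
the measured round-trip time), and the model counts one wait per
retransmission plus a final wait before giving up. The real kernel
logic is more involved, so treat the numbers as ballpark figures.

```python
def retransmission_timeouts(initial_rto, retries=15, rto_max=120.0):
    """List the waits (in seconds) of a simplified backoff schedule.

    Assumption: the timeout doubles after every unacknowledged attempt,
    is capped at rto_max, and there are retries + 1 waits in total
    (one before each retransmission, plus the last one before the
    connection is declared dead).
    """
    return [min(initial_rto * 2 ** attempt, rto_max)
            for attempt in range(retries + 1)]


def total_timeout(initial_rto, retries=15, rto_max=120.0):
    """Sum the simplified schedule to estimate the time until ETIMEDOUT."""
    return sum(retransmission_timeouts(initial_rto, retries, rto_max))


if __name__ == "__main__":
    # Hypothetical initial timeouts, chosen only to show the spread.
    for rto in (0.2, 1.0, 3.0):
        minutes = total_timeout(rto) / 60
        print(f"initial RTO {rto:.1f}s -> about {minutes:.1f} minutes")
```

Passing a smaller retries value shows how lowering tcp_retries2 would
shorten the wait. Whether that is appropriate for a given server is a
separate question that this sketch does not answer.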