On Monday 10 February 2003 11:15, David S. Miller wrote:
>> From: Martin Zielinski <mz@seh.de>
>> Date: Mon, 10 Feb 2003 09:44:21 +0100
>>
>> There seems to be a bug in net/ipv4/tcp_output.c - tcp_send_probe0(...):
>> During a tcp write the receiver can hold the connection by responding
>> with ACKs window size 0. In this situation the sender (linux) has timeouts
>> between sending ACKs to test, if the connection is still valid.
>> This timeout is controlled by "min(tp->rto << tp->backoff,
>> TCP_RTO_MAX)". tp->backoff is always increased by 1.
>> On a 32 bit machine at least a tp->backoff value of 32 results in a 0
>> for this expression.
>
> For retransmits it can never reach the value 32 because the
> backoff bumping there is capped by sysctl_tcp_retries2 which by
> default is TCP_RETR2 or 16.

I'm no kernel developer, so I might be overlooking something...
sysctl_tcp_retries2 seems to be responsible for tp->probes_out in
tcp_timer.c - tcp_probe_timer(). I see no place there where tp->backoff
is reset or limited by this value. tp->backoff is only a counter used to
double the wait between retries, nothing more.
sk->dead seems to stay 0, although I did not track down where this is
controlled.

> And for the probe case, it is limited also by the same value.
> See the tests in tcp_probe_timer(), where 'max_probes' is assigned
> to sysctl_tcp_retries2, and tcp_send_probe0() is only invoked
> if "tp->probes_out" is less than this.

Obviously. After reaching the count, the connection breaks.
This condition should not be reached until armageddon, as long as the
client responds with ACKs with window size zero (see the comment by
"ANK" in the code).

> Every time tcp_send_probe0()
> increases "tp->probes_out" it also increases "tp->backoff".

I think you overlooked the resetting of tp->probes_out to 0 in
tcp_input.c - tcp_ack() with each ACK coming back from the receiver.
tp->packets_out must be 0 (otherwise the relevant code in
tcp_send_probe0() would not be reached), so the code jumps to the
"no_queue" label.

> So I don't think tp->backoff can ever reach 32. Did you add debugging
> statements to tcp_send_probe0() to find this out? What exactly did
> these debugging statements print out for you?

I have printks in the code. To track down our problem we set
TCP_RTO_MAX to 1*HZ to speed things up and added:

    printk ("backoff = %d, probes_out = %d\n", tp->backoff, tp->probes_out);

before the tcp_reset_xmit_timer() statement in tcp_output.c -
tcp_send_probe0().

The output shows that tp->backoff is always increased and
tp->probes_out is 1 (increased just before). It then takes 20 seconds
to get the ACK burst, when the timeout becomes zero.

Please note that this is a valid connection held open by the receiver!
The sysctl_tcp_retries2 value is (as far as I understand it) for
declaring the connection broken if there is *NO* response from the
receiver.

I could provide a tcpdump trace, if wanted.

Bye, Martin

-- 
Martin Zielinski
mz@seh.de