Re: TCP Connection times out

On Monday 10 February 2003 11:15, David S. Miller wrote:
>>    From: Martin Zielinski <mz@seh.de>
>>    Date: Mon, 10 Feb 2003 09:44:21 +0100
>>
>>    There seems to be a bug in net/ipv4/tcp_output.c - tcp_send_probe0(...):
>>    During a tcp write the receiver can hold the connection by responding
>> with ACKs of window size 0. In this situation the sender (linux) has timeouts
>> between sending ACKs to test if the connection is still valid.
>>    This timeout is controlled by "min(tp->rto << tp->backoff,
>> TCP_RTO_MAX)". tp->backoff is always increased by 1.
>>    On a 32 bit machine at least a tp->backoff value of 32 results in a 0
>> for this expression.

> For retransmits it can never reach the value 32 because the
> backoff bumping there is capped by sysctl_tcp_retries2 which by
> default is TCP_RETR2 or 16.

I'm not a kernel developer, so I might be overlooking something...

sysctl_tcp_retries2 seems to be responsible for limiting tp->probes_out in 
tcp_timer.c - tcp_probe_timer().
I see no place there where tp->backoff is reset or limited by this value. It 
is only a counter that doubles the wait between retries, nothing more.
sk->dead seems to stay 0, although I did not track down where it is 
controlled.
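
To illustrate the wraparound I mean, here is a minimal user-space sketch, not
kernel code. It assumes a 32-bit unsigned timeout; SKETCH_RTO_MAX and both
function names are made up for this example. The shift is modelled as repeated
doubling, because shifting a 32-bit value by 32 or more is undefined in C.

```c
#include <stdint.h>

/* Hypothetical stand-in for TCP_RTO_MAX, in jiffies (assumed value). */
#define SKETCH_RTO_MAX 12000u

/*
 * Models min(rto << backoff, TCP_RTO_MAX) in 32-bit unsigned arithmetic.
 * Each doubling wraps modulo 2^32, so set bits are eventually shifted out.
 */
uint32_t probe_timeout(uint32_t rto, unsigned int backoff)
{
    uint32_t t = rto;
    unsigned int i;

    for (i = 0; i < backoff; i++)
        t *= 2u;

    return t < SKETCH_RTO_MAX ? t : SKETCH_RTO_MAX;
}

/* Smallest backoff (0..64) at which the timeout collapses to 0, else -1. */
int first_zero_backoff(uint32_t rto)
{
    unsigned int b;

    for (b = 0; b <= 64; b++)
        if (probe_timeout(rto, b) == 0)
            return (int)b;

    return -1;
}
```

Under these assumptions the cap holds only while some bit of rto survives the
shift. Once every set bit has been shifted out, the min() picks 0 and the timer
fires immediately.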

>
> And for the probe case, it is limited also by the same value.
> See the tests in tcp_probe_timer(), where 'max_probes' is assigned
> to sysctl_tcp_retries2, and tcp_send_probe0() is only invoked
> if "tp->probes_out" is less than this.  

Obviously. After reaching the count, the connection breaks. 
This condition should not be reached until armageddon as long as the client 
responds with ACKs of window size zero (see the comment by "ANK" in the code).

> Every time tcp_send_probe0()
> increases "tp->probes_out" it also increases "tp->backoff".

I think you overlooked the resetting of tp->probes_out to 0 in 
tcp_input.c - tcp_ack() with each ACK coming back from the receiver.
tp->packets_out must be 0, otherwise the relevant code in tcp_send_probe0 
would not be reached, so the code jumps to the "no_queue" label.
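
The interaction can be sketched in user space as follows. This is a simplified
model under my own assumptions, not the kernel's code: struct probe_state and
all three functions are hypothetical, and the abort test is reduced to a plain
comparison against max_probes.

```c
/* Hypothetical, simplified model of the fields discussed. */
struct probe_state {
    unsigned int backoff;     /* models tp->backoff */
    unsigned int probes_out;  /* models tp->probes_out */
    int aborted;              /* set once the connection would be dropped */
};

/* Probe timer fires: give up only if too many probes are unanswered. */
void probe_timer_fires(struct probe_state *s, unsigned int max_probes)
{
    if (s->probes_out >= max_probes) {
        s->aborted = 1;
        return;
    }
    s->backoff++;     /* never reset in this model */
    s->probes_out++;
}

/* A zero-window ACK arrives: only probes_out is reset. */
void zero_window_ack(struct probe_state *s)
{
    s->probes_out = 0;
}

/* Runs up to `rounds` probe cycles and returns the final backoff. */
unsigned int run_rounds(struct probe_state *s, unsigned int rounds,
                        unsigned int max_probes, int receiver_answers)
{
    unsigned int i;

    for (i = 0; i < rounds && !s->aborted; i++) {
        probe_timer_fires(s, max_probes);
        if (receiver_answers)
            zero_window_ack(s);
    }
    return s->backoff;
}
```

With a receiver that answers every probe, probes_out never gets near
max_probes in this model, while backoff simply counts the rounds. Only a
silent receiver triggers the abort.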

>
> So I don't think tp->backoff can ever reach 32.  Did you add debugging
> statements to tcp_send_probe0() to find this out?  What exactly did
> these debugging statements print out for you?

I have printks in the code. To track down our problem we set TCP_RTO_MAX to 
1*HZ to speed things up and added:

printk ("backoff = %d, probes_out = %d\n", tp->backoff, tp->probes_out);

before the tcp_reset_xmit_timer() statement in tcp_output.c - 
tcp_send_probe0().

The output shows that tp->backoff is always increased and tp->probes_out is 
1 (increased just before).
It then takes 20 seconds to get the ACK burst, when the timeout becomes zero.

Please note, that this is a valid connection held open by the receiver! 

The sysctl_tcp_retries2 value is (as far as I understand it) for declaring 
the connection broken if there is *NO* response from the receiver.

I could provide a tcpdump trace, if wanted.

Bye,
Martin

-- 
Martin Zielinski       mz@seh.de
