Re: Review of draft-ietf-dccp-tfrc-rtt-option-00.txt

Michael,

thank you very much indeed for your review and the helpful comments.

I am working with Gorry on a new revision to incorporate your and
Eddie's comments. Before submitting this, I would like to turn to
the one section which you have correctly identified as guesswork.

> par 5: the problem described by the first two sentences: "The fifth  
> and last problem is starvation under burst loss..." is not clear  
...
> really measured, it _sounds_ like guesswork to me - this just needs  
> some more precise language.


To find out whether I have misunderstood something or whether only the
description is problematic, I would like to present the details.

The problem was observed on an 802.11g link in the 2.4 GHz ISM band, which had
an average RTT of 2 ms. Wireshark traces of TCP traffic made it clear that there
was interference on the channel (duplicate ACKs and retransmitted packets).

TCP streaming seemed to cope with the occasional interference.

With UDP there were occasional 'holes' in the stream, which could successfully
be fixed by the application-layer FEC provided by the paraslash streamer
application.

With DCCP/CCID-3, however, the transmission occasionally "died", i.e. the sender
fell back to sending only one packet every t_mbi = 64 seconds. This was
accompanied by RTT out-of-bounds warnings at the receiver:

 Jul 15 22:01:26 kernel: [ 2311.949466] dccp_sane_rtt: RTT sample  4766615 out of bounds!
 Jul 15 22:01:39 kernel: [ 2324.335916] dccp_sane_rtt: RTT sample 12373169 out of bounds!
 Jul 15 22:02:11 kernel: [ 2356.548447] dccp_sane_rtt: RTT sample 32193564 out of bounds!
 Jul 15 22:03:15 kernel: [ 2420.760223] dccp_sane_rtt: RTT sample 64201733 out of bounds!

These messages occur when an RTT sample is greater than 3,000,000 microseconds.
Once the samples were past this bound, the values approximately doubled each time.
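
As a quick cross-check of the log excerpt above, the following minimal Python
sketch computes the ratio between consecutive samples and compares each sample
with the time elapsed since the previous warning. The numbers are copied from
the four log lines; the helper names and the constant name are illustrative
only and are not taken from the kernel source.

```python
# Kernel timestamps (seconds) and RTT samples (microseconds) from the log above.
LOG = [
    (2311.949466, 4766615),
    (2324.335916, 12373169),
    (2356.548447, 32193564),
    (2420.760223, 64201733),
]

RTT_BOUND_US = 3_000_000  # bound quoted in the text; the name is illustrative


def out_of_bounds(sample_us):
    """True if a sample exceeds the bound quoted in the text."""
    return sample_us > RTT_BOUND_US


def growth_ratios(log):
    """Ratio of each RTT sample to the previous one."""
    samples = [s for _, s in log]
    return [b / a for a, b in zip(samples, samples[1:])]


def gaps_us(log):
    """Time between consecutive warnings, in microseconds."""
    stamps = [t for t, _ in log]
    return [(b - a) * 1e6 for a, b in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    rows = zip(growth_ratios(LOG), gaps_us(LOG), LOG[1:])
    for ratio, gap, (_, sample) in rows:
        print(f"sample={sample:>9} us  ratio={ratio:.2f}  gap={gap:,.0f} us")
```

For these four lines the ratios lie between roughly 2.0 and 2.6, and each
sample is within about 0.1% of the time since the previous warning. This would
fit samples that are derived from the inter-packet spacing, but that reading is
an assumption, not something the log alone proves.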

These are the facts; the rest is an interpretation, an attempt to figure out exactly
what happened. I have considered whether this is a bug in the implementation, but think
it unlikely (all code is open source and publicly available, and it has been checked a
few times).

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Here is what I think happened:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 * first packet gets sent and arrives with normal link delay,
 * second packet with a CCVal difference between 1..4 is sent, but is delayed
   for more than 4 times the average RTT,
 * nofeedback timer at the sender is triggered after 4*RTT, halving X / doubling t_ipi,
 * there are now two different rates of change:
   - X is halved immediately (step reduction),
   - the RTT however passes through the low-pass filter
           RTT' = 0.9 * RTT + 0.1 * sample
     so that sample = 10 RTT gives RTT' = 1.9 RTT, sample = 100 RTT gives RTT' = 10.9 RTT, etc.,
 * since RTT' is used for the CCVal window counter value, the change-rate for the
   CCVal window counter is also slower than the change-rate of X,
 * the receiver then "sees" a larger inter-packet gap caused by the immediate change of X,
   accompanied by an almost unchanged rate of change for the CCVal values,
 * this has the effect of doubling the sampled RTT at the receiver,
 * the receiver cannot counteract sudden changes in the RTT, since it typically has
   fewer usable samples than the sender, so the effective RTT increases as well.
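
The two rates of change in the list above can be illustrated with a minimal
Python sketch. It uses the filter exactly as written above; the function names
and the scenario in samples_to_track() (every sample after the step equals the
new value, RTT normalised to 1) are assumptions made for illustration only.

```python
def smooth_rtt(rtt, sample, weight=0.9):
    """Low-pass filter from the text: RTT' = 0.9 * RTT + 0.1 * sample."""
    return weight * rtt + (1.0 - weight) * sample


def samples_to_track(factor=2.0, tolerance=0.1, weight=0.9):
    """Count the samples needed until the smoothed RTT follows a step change.

    Hypothetical scenario: the true RTT jumps from 1 to `factor`, and every
    later sample equals the new value. Returns how many samples pass before
    the smoothed RTT is within `tolerance` of the new value.
    """
    rtt, target, n = 1.0, factor, 0
    while target - rtt > tolerance:
        rtt = smooth_rtt(rtt, target, weight)
        n += 1
    return n


if __name__ == "__main__":
    # The two examples from the text, with RTT normalised to 1.
    print(f"{smooth_rtt(1.0, 10.0):.1f}")   # sample = 10 RTT
    print(f"{smooth_rtt(1.0, 100.0):.1f}")  # sample = 100 RTT
    # Number of samples for the filter to follow a doubling of the RTT.
    print(samples_to_track())
```

Under these assumptions the halving of X takes effect in a single step, whereas
the smoothed RTT needs 22 samples to come within 0.1 RTT of a doubled value.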

I think that the different change rates are a general problem, and this is what the
draft concentrates on.

But why did the RTT samples double each time?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 * if the receiver sees a doubling of the RTT, it sends feedback only at half the speed,
 * if this process happens just twice, the receiver sends feedback roughly every 4*RTT, which
   is enough to trigger the nofeedback timer at the sender again,
 * which then causes the process to start over, until the sender finally sends only
   1 packet per 64 seconds.
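
A minimal Python sketch of this feedback loop, under simplifying assumptions:
the nofeedback timer is taken to expire after 4*RTT as described above, each
expiry doubles t_ipi, and t_ipi is capped at t_mbi = 64 seconds. The function
names and the starting t_ipi of 1 ms are hypothetical and not taken from the
lost traces.

```python
T_MBI = 64.0  # maximum back-off interval in seconds, as quoted in the text


def feedback_triggers_timer(receiver_doublings, timeout_factor=4):
    """Check whether the receiver's feedback is slow enough to fire the timer.

    The receiver is assumed to send feedback once per its own RTT estimate,
    which has doubled `receiver_doublings` times relative to the sender's RTT.
    The sender's nofeedback timer is assumed to expire after
    `timeout_factor` * RTT.
    """
    return 2 ** receiver_doublings >= timeout_factor


def expiries_until_floor(t_ipi, t_mbi=T_MBI):
    """Count nofeedback expiries until t_ipi reaches t_mbi.

    Each expiry halves X, i.e. doubles the inter-packet interval t_ipi,
    which is capped at t_mbi.
    """
    n = 0
    while t_ipi < t_mbi:
        t_ipi = min(2.0 * t_ipi, t_mbi)
        n += 1
    return n


if __name__ == "__main__":
    print(feedback_triggers_timer(1))   # one doubling: feedback every 2*RTT
    print(feedback_triggers_timer(2))   # two doublings: feedback every 4*RTT
    print(expiries_until_floor(0.001))  # hypothetical start at t_ipi = 1 ms
```

With the hypothetical 1 ms starting interval, 16 consecutive expiries are
enough to reach one packet per 64 seconds.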

I have lost the traces, but believe that the problem can be reproduced with standard 2.4 GHz
access points in environments where there is contention on the ISM band, as well as other
interference from DECT cordless phones, Bluetooth, and microwave ovens.

Gerrit

