| > | And so I guess that this NET_XMIT_DROP should be ignored by dccp code?
| >
| > In the test tree it is now (almost) ignored. Since the time you reported
| > this problem I have changed the DCCP_BUG to a DCCP_WARN, so that the
| > drop will still be logged, but there will be fewer such warnings in the
| > log now (in DCCP_WARN, printk is rate-limited).
| >
| I'm not sure it should print the warning. Ok, it's nice to know that the
| packet was dropped but:
| a) in the real world there would be no such information,
| b) it's not unusual for DCCP to lose a packet without warning.
| Maybe it should only warn if a debugging option is enabled?

Ok, that is a good point. What I will do is change the DCCP_WARN() for this
error message to a dccp_pr_debug(), so that the message will only get printed
for debugging purposes.

| > | > * to avoid this, there is a kernel configuration option of CCID-3
| > | >   to set an upper bound for this.
| > |
| > | How do I set it?
| >
| > In the menu under
| >   Networking -> Network Options -> The DCCP Protocol (EXPERIMENTAL)
| >     -> DCCP CCIDs Configuration (EXPERIMENTAL) -> CCID3
| >       -> (100) Use higher bound for nofeedback timer
| > Ah - just remembered -- the default is 100 milliseconds, so this will
| > probably have caught the problems with the low RTT.
| >
| 100ms? Then it doesn't work as expected, because adding just 1ms of delay
| with netem fixed the problem.
| If you meant 100us then it still doesn't work. Changing the parameter to
| 1000 delays the bug - I have to send more packets for it to happen.

It is not as simple as that. This value determines a lower bound, in order to
cope with very low RTTs (i.e. less than 1 millisecond).

What this value changes is when the nofeedback timer is triggered. This is
normally max(4*RTT, 2*s/X), so a minimum of 4 RTT. But when RTTs are low, it
can happen that the nofeedback timer is triggered several times between
sending two frames (e.g. a VoIP inter-frame interval of 20ms). The
configuration value therefore sets a lower bound of

    timeout = max(CONFIG_RTO_MIN, max(4*RTT, 2*s/X))

to cope with the problem of low RTTs (a small sketch of this calculation
follows below).

You are calling this a bug -- I think it is quite likely that CCID-3 is
simply not meant for the way you are using it. Please see below.

| > So from what you wrote I read
| >  * without additional delay the described problem occurs and CCID-3
| >    gets into 1-packet-per-64-seconds mode
| >  * when you add 1 millisecond delay to the interface then it works ok.
| >
| That's exactly what I meant. After adding this 1ms delay I was not able to
| reproduce the bug. This does not mean it would not happen after long enough
| testing.

Thanks for providing the dccp_probe data. I have had a look at it and it is
more or less as expected. The value of the RTT is: min 3 usec, avg 48.0 usec,
max 4.513 msec, with a stddev of 324.68. The loss was 0 all the time, and the
receiver reported something like 1.8 kbps (230 bytes per second).

The start-up behaviour is that there is a spike in the RTT where it reaches
its maximum right at the beginning (4.5 msec); this soon fades out after the
first 5 seconds, after which it settles at the average of 48 microseconds.
This RTT is about 5..10 times lower than a standard PC RTT (250..500
microseconds).

Now I have a question: adding 1 ms delay at the interface avoided the
hang-up, but what do you mean by "long enough"? Is this 60 seconds?

I have attached the plot of the t_ipi, which indeed climbs to astronomical
values after the initial period (when the RTT was low).
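To make the timeout calculation above concrete, here is a minimal,
self-contained sketch. This is not the in-tree ccid3.c code; the function
name nofeedback_timeout_us and the parameters rto_min_us, rtt_us, s and
x_bytes_per_sec are made up for illustration only:

/*
 * Illustrative sketch of  timeout = max(CONFIG_RTO_MIN, max(4*RTT, 2*s/X)).
 * Times are in microseconds, s in bytes, X in bytes per second.
 */
#include <stdint.h>
#include <stdio.h>

static uint64_t max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

static uint64_t nofeedback_timeout_us(uint64_t rto_min_us, uint64_t rtt_us,
				      uint32_t s, uint64_t x_bytes_per_sec)
{
	uint64_t four_rtt     = 4 * rtt_us;
	uint64_t two_s_over_x = (2ULL * s * 1000000ULL) / x_bytes_per_sec;

	/* Without the configured floor, a 48 usec RTT yields a timeout of
	 * only ~192 usec, which expires many times within a 20 ms VoIP
	 * inter-frame interval. */
	return max_u64(rto_min_us, max_u64(four_rtt, two_s_over_x));
}

int main(void)
{
	/* 100 ms default floor, 48 usec RTT, 100-byte packets, 1.25 MB/s */
	printf("%llu usec\n", (unsigned long long)
	       nofeedback_timeout_us(100000, 48, 100, 1250000));
	return 0;
}

With the default 100 ms floor, the 48 usec RTT case is clamped to
100000 usec instead of the bare ~192 usec that max(4*RTT, 2*s/X) would give.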
But since the average RTT is 48 microseconds in the `outfile', I am assuming
that the outfile was produced when the delay was not added to the interface?

There is a long-standing problem (at least 1 year old) with CCID-3 over
high-speed networks (and the lo interface is in fact a high-speed interface):
high link speeds are outside the control region of CCID-3. The peak limit of
controllable speed is about 12 Mbps; everything higher than that will cause
problems. I have not tested it, but you may also be able to silence this
behaviour by using a netem token-bucket filter with e.g. a maximum bitrate of
<= 10 Mbps (example commands below).

So I think the least we need to do is put a warning into the DCCP Wiki that
people should add interface delay when using CCID-3 for testing over
loopback.
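For reference, and as a sketch only (I have not verified these exact
parameters, and the burst/latency values are just placeholders), the two
workarounds could be set up with tc along these lines:

# add 1 ms of delay on loopback (what avoided the hang-up in your test)
tc qdisc add dev lo root netem delay 1ms

# or, alternatively, cap the rate below the ~12 Mbps control limit
tc qdisc add dev lo root tbf rate 10mbit burst 10kb latency 70ms

Either of these keeps the loopback path inside the region that CCID-3 can
actually control; they are alternatives, not meant to be combined on the
same root qdisc.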
[Attachment: t_ipi.png (PNG image)]