The first figure certainly demonstrates a problem. However, that
problem is not inherent in CCID3, it is not inherent in rate-based
solutions, and high-rate timers probably wouldn't solve it. CCID3 has
been tested -- in simulation, mind you -- at high rates. The problem is
a bug in the Linux implementation. Ian seems to think he can solve the
problem with bursts and I am inclined to agree.
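To illustrate why bursts need not disturb the average rate, here is a minimal sketch. It is not Ian's patch and not the Linux CCID3 code; the function names and the numbers (100 Mbit/s, 1500-byte packets, 1 ms timer, 50 ms RTT) are assumptions chosen for illustration. It models a sender that can only wake on a coarse timer tick and, on each tick, sends every packet whose nominal send time (spaced by t_ipi = s/X) has already passed.

```python
from fractions import Fraction

def burst_schedule(rate_bps, packet_bytes, tick_us, duration_us):
    """Packets sent at each timer tick when the sender can only wake every tick_us."""
    # Nominal inter-packet interval t_ipi = s / X, in microseconds (exact).
    t_ipi = Fraction(packet_bytes * 8 * 1_000_000, rate_bps)
    t_nom = Fraction(0)  # nominal send time of the next packet
    sent = []
    for now in range(0, duration_us, tick_us):
        burst = 0
        # Send every packet whose nominal time has already passed.
        while t_nom <= now:
            burst += 1
            t_nom += t_ipi
        sent.append(burst)
    return sent

def per_rtt_rates_bps(sent, packet_bytes, tick_us, rtt_us):
    """Average sending rate over each consecutive RTT-sized window."""
    ticks_per_rtt = rtt_us // tick_us
    rates = []
    for i in range(0, len(sent) - ticks_per_rtt + 1, ticks_per_rtt):
        packets = sum(sent[i:i + ticks_per_rtt])
        rates.append(packets * packet_bytes * 8 * 1_000_000 / rtt_us)
    return rates

if __name__ == "__main__":
    # Hypothetical scenario: t_ipi is 120 usec, far below the 1 ms tick.
    sent = burst_schedule(100_000_000, 1500, 1000, 200_000)
    print(sent[:6])
    print(per_rtt_rates_bps(sent, 1500, 1000, 50_000))
```

In this scenario every tick after the first sends a burst of 8 or 9 packets, yet each RTT-sized window averages within about 2% of the target rate. The sketch ignores feedback, loss, and queueing, so it says nothing about where the observed oscillation comes from.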
Your comments about X_crit are based on your observations, not analysis,
yes? If you can provide some reason why CCID3 inherently has an X_crit,
I'd like to hear it. "Oscillat[ing] between the top available speed ...
and whatever it gets in terms of feedback" is not TFRC. Sounds like a bug.
I agree that kernel maintainers don't want bugs in the kernel.
Anyway, if you can go deeper into the code and determine why you're
observing this behavior (I assume in the absence of loss, which is even
weirder), then that might be useful.
Eddie
Gerrit Renker wrote:
Quoting Eddie Kohler:
| > The problem I see is that
| > * scheduling granularity is at best 1ms
| > * hence t_gran/2 is at best 500usec
| > * and so when t_ipi < 1ms, packets will always be sent in bursts
| >
| > So we have a `critical point' after which the average sending rate is dominated by the
| > hardware and no longer under the rate-based control of t_ipi.
|
| I think there might be a confusion about what "average sending rate"
| actually means. You appear to assume that precise timing control is
| needed to obtain a smooth average. I.e., if there are bursts, then
| there is no smooth average. But this isn't what the RFC means by
| average. The average rate is supposed to be smooth *from one RTT to the
| next*. Sub-RTT burstiness is *explicitly allowed*, although
| implementations should try to avoid it when it's easy to do so.
Sorry, I don't think you got my point. The problem is NOT in occasional bursts, but rather
that there is a critical speed X_crit after which the system essentially gets completely out
of control.
If bursts occurred only occasionally, it would be like a car which speeds up for a moment
but then slows down again (and thus keeps its average speed) - I agree that this would not be a problem.
Here however we have that, once X_crit is reached, the system will _oscillate_ between the top
available speed (and this could be hundreds of Mbits per second) and whatever it gets in terms
of feedback from the receiver. After slowing down, it would again climb in slow-start up to
X_crit, then jump to top speed. Thus it is like a car which will abruptly switch to top speed
of e.g. 160 mph when trying to accelerate past 2 mph.
I have a snapshot which illustrates this state:
http://www.erg.abdn.ac.uk/users/gerrit/dccp/dccp_probe/examples/no_tx_locking/transmit_rate.png
The oscillating behaviour is clearly visible. In contrast, I am sure that you would agree that the
desirable state is the following:
http://www.erg.abdn.ac.uk/users/gerrit/dccp/dccp_probe/examples/with_tx_locking/transmit_rate.png
These snapshots were originally taken to compare the performance with and without serializing access to
TX history. I didn't submit the patch since, at times, I would get the same chaotic behaviour with TX locking.
Other people on this list have reported that iperf performance is unpredictable with CCID 3.
The point is that, without putting in some kind of control, we have a system which gets into a state of
chaos as soon as the maximum controllable speed X_crit is reached. When it is past that point, there is
no longer a notion of predictable performance or correct average rate: what happens is then outside the
control of the CCID 3 module, and performance becomes a matter of chance.
I don't think that a kernel maintainer will gladly support a module which is liable to reach such a
chaotic state.
| > I have done a back-of-the-envelope calculation below for different sizes of s; 9kbyte
| > I think is the maximum size of an Ethernet jumbo frame.
| >
| > -----------+---------+---------+---------+---------+-------+---------+-------+
| > s (bytes)  | 32      | 100     | 250     | 500     | 1000  | 1500    | 9000  |
| > -----------+---------+---------+---------+---------+-------+---------+-------+
| > X_critical | 32kbps  | 100kbps | 250kbps | 500kbps | 1mbps | 1.5mbps | 9mbps |
| > -----------+---------+---------+---------+---------+-------+---------+-------+
| >
| > That means we can only expect predictable performance up to 9mbps ?????
|
| Same comment. I imagine performance will be predictable at speeds FAR
| ABOVE 9mbps, DESPITE the sub-RTT bursts. Predictable performance means
| about the same average rate from one RTT to the next.
I think that, without finer timer resolution, we need to put in some kind of throttle to avoid
entering the region where speed can no longer be controlled.
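The arithmetic behind the X_critical table above can be reproduced as a small sketch. It assumes X_crit is the rate at which t_ipi = s/X falls to the 1 ms scheduling granularity, so that X_crit = s / t_gran with s in bytes; the function name and the default granularity are illustrative, not taken from the CCID3 code. Under that assumption the table's figures correspond to bytes per second (32 bytes per 1 ms is 32 kbyte/s); expressed in bits per second the same thresholds would be eight times larger.

```python
def x_crit(packet_bytes, t_gran_us=1000):
    """Return (bytes_per_second, bits_per_second) at which t_ipi equals t_gran.

    Assumes t_ipi = s / X, so X_crit = s / t_gran. t_gran_us defaults to the
    1 ms scheduling granularity discussed in the thread.
    """
    bytes_per_s = packet_bytes * 1_000_000 // t_gran_us
    return bytes_per_s, bytes_per_s * 8

if __name__ == "__main__":
    for s in (32, 100, 250, 500, 1000, 1500, 9000):
        by, bi = x_crit(s)
        print(f"s={s:5d} bytes  X_crit={by:>9d} bytes/s = {bi:>10d} bit/s")
```

For a 9000-byte packet this gives 9,000,000 bytes/s, or 72,000,000 bit/s, at a 1 ms granularity.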
| > I am dumbstruck - it means that the whole endeavour to try and use Gigabit cards (or
| > even 100 Mbit ethernet cards) is futile and we should be using the old 10 Mbit cards???
|
| Remember that TCP is ENTIRELY based on bursts!!!!! No rate control at
| all. And it still gets predictable performance at high rates.
|
Yes, but ..... it uses an entirely different mechanism and is not rate-based.