Hi Neal,
On Fri, 18 Dec 2020, Neal Cardwell wrote:
On Wed, Dec 16, 2020 at 2:39 PM Markku Kojo <kojo@xxxxxxxxxxxxxx> wrote:
> For (2), the RTO timer is still operative so
> the RTO recovery rules would still follow.
In short:
With a non-RACK-TLP implementation, when the timer (RTO) expires: cwnd=1 MSS,
and slow start is entered.
With a RACK-TLP implementation, when the timer (PTO) expires,
normal fast recovery is entered (unless PRR is also
implemented). So no RTO recovery, as explicitly stated in Sec. 7.4.1.
This means that this document explicitly modifies standard TCP congestion
control when there are no acks coming and the retransmission timer
expires
from: RTO=SRTT+4*RTTVAR (RTO used for arming the timer)
It's also worth mentioning this aspect of [RFC6298]:
Sure.
(2.4) Whenever RTO is computed, if it is less than 1 second, then the
RTO SHOULD be rounded up to 1 second.
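
As a rough sketch of how the two timer values discussed in this thread compare, the snippet below evaluates RTO=SRTT+4*RTTVAR with the 1-second floor from rule (2.4), and PTO=min(2*SRTT,RTO) in the simplified form used in this message. The function names and the sample SRTT/RTTVAR values are assumptions for illustration only; the clock-granularity term of RFC 6298 and any further PTO adjustments in the draft are left out.

```python
def rto_seconds(srtt, rttvar, min_rto=1.0):
    """RTO = SRTT + 4*RTTVAR, rounded up to min_rto (RFC 6298, rule 2.4).

    Simplification: the clock-granularity term G is omitted.
    """
    return max(srtt + 4.0 * rttvar, min_rto)


def pto_seconds(srtt, rttvar, min_rto=1.0):
    """PTO = min(2*SRTT, RTO), the simplified form quoted in this thread."""
    return min(2.0 * srtt, rto_seconds(srtt, rttvar, min_rto))


# Hypothetical path: SRTT = 100 ms, RTTVAR = 25 ms.
srtt, rttvar = 0.100, 0.025
print("RTO:", rto_seconds(srtt, rttvar))  # raw value 0.2 s is raised to the 1 s floor
print("PTO:", pto_seconds(srtt, rttvar))  # 2*SRTT, well below the RTO
```

With these sample values the PTO would fire several times sooner than the RTO, which is why the choice of recovery rules after the earlier timer matters.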
1. RTO timer expires
2. cwnd=1 MSS; ssthresh=FlightSize/2; rexmit one segment
3. Ack of rexmit sent in step 2 arrives
4. cwnd = cwnd+1 MSS; send two segments
...
to: PTO=min(2*SRTT,RTO) (PTO used for arming the timer)
1. PTO timer expires
2. (cwnd=1 MSS); (re)xmit one segment
It may be worthwhile to point out here that the RACK-TLP draft does not specify setting cwnd
to 1 at this point, and the Linux TCP implementation from our team does not do this. The
Yes, that's why I put it in parentheses. In my view the RACK-TLP
draft implicitly limits cwnd to one segment by allowing just one TLP
probe segment.
rationale is that at this point there is no solid evidence that anything has been lost, and
setting cwnd to 1 at this point would make the algorithm more timid than the preceding
approaches, for no good reason.
Sure, no need to set cwnd at this point.
A good reason could be: No feedback, Ack clock lost? But, of course,
it is too early, even though after the arrival of the ack the sender may well
modify cwnd again, as it does now if it decides that the loss was of
something other than the probe segment.
3. Ack of (re)xmit sent in step 2 arrives
4. cwnd = ssthresh = FlightSize/2; send N=cwnd segments
That step (4) assumes a particular congestion control implementation that is different than
what we would recommend.
Ok. I just used the Standards Track formula, as does the RACK-TLP draft in
its examples, and because the RACK-TLP draft states that it does not modify
current congestion control.
For example, if FlightSize is 100 segments when the timer expires,
congestion control is the same in steps 1-3, but in step 4 the
current standard congestion control allows transmitting 2 segments,
while RACK-TLP would allow blasting 50 segments.
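
The arithmetic behind this example can be sketched as follows. This only models the step lists as written in this message (cwnd counted in segments), not any particular implementation; the function names are hypothetical, and the floor of 2 segments on ssthresh is taken from RFC 5681 rather than from the steps above.

```python
def rto_recovery_after_first_ack(flight_size):
    """Steps 1-4 of the RTO case: slow start from cwnd = 1 segment."""
    ssthresh = max(flight_size // 2, 2)  # ssthresh = FlightSize/2 (floor of 2 per RFC 5681)
    cwnd = 1                             # step 2: cwnd = 1 MSS, retransmit one segment
    cwnd += 1                            # step 4: ack arrives, cwnd = cwnd + 1 MSS
    return cwnd, ssthresh


def pto_fast_recovery_after_first_ack(flight_size):
    """Steps 1-4 of the PTO case as read here: fast recovery without PRR."""
    ssthresh = max(flight_size // 2, 2)
    cwnd = ssthresh                      # step 4: cwnd = ssthresh = FlightSize/2
    return cwnd, ssthresh


flight_size = 100
print(rto_recovery_after_first_ack(flight_size))       # (2, 50)
print(pto_fast_recovery_after_first_ack(flight_size))  # (50, 50)
```

The ssthresh value is the same in both cases; the difference is only in how many segments the sender may transmit on the first ack.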
Question is: what is the justification to modify standard TCP
congestion control to use fast recovery instead of slow start for a
case where timeout is needed to detect loss because there is no
feedback and ack clock is lost? The draft does not give any
justification. This clearly is in conflict with items (0) and (1)
in BCP 133 (RFC 5033).
The draft pointedly does not modify standard TCP congestion control.
RACK-TLP does not specify using fast recovery instead of slow start for a case where timeout
is needed to detect loss because there is no feedback and the ACK clock is lost. Rather,
RACK-TLP only triggers fast recovery if there *is* ACK feedback providing an ACK clock and
strong evidence of a packet loss.
So here our views diverge. In the above steps I decoupled congestion
control from what segments are sent (rexmit and xmit are mentioned there
just as comments to check what is going on, they can be freely removed).
Congestion control governs how many segments can be sent.
In my view, when there is no feedback RACK TLP uses timeout (PTO) to help
make progress. Without the timeout it cannot make progress. Just like
an RFC 5681 sender, it cannot make progress until timeout expires.
So this should be taken as the criterion to (effectively) enter slow start,
once loss is detected.
Or, at least, I don't see why a different timeout value would
change the congestion control.
When timeout expires RACK-TLP sends one segment (just like an RFC 5681
sender when RTO expires). The only difference is that RFC 5681 sender
selects a different segment (first unacknowledged segment) to retransmit
"blindly" in order to get feedback and start ACK clock. RACK-TLP sends
"blindly" the last segment from the retransmission queue (or a new
segment). Selecting a different segment for transmission upon timeout
does not change anything, in my view. In both cases it is a "blind"
selection; the sender does not know what was lost. And in both cases the
ACK for this one segment provides feedback about what potentially has
been lost. There the only difference is that the segment that RACK-TLP
selected to transmit is a better choice when the SACK option is in use because
it provides more information.
If there is some difference in that the ACK for RACK-TLP provides
stronger evidence for packet loss (and what was lost), then it should be
also ok to modify the current standard TCP congestion control such that
upon RTO timeout the sender does not select the first unacknowledged
segment for blind retransmission but the last segment in the
retransmission queue (or maybe a new segment). With SACK this would
provide exactly the same information as TLP probe does. And, upon arrival
of the first ACK, RTO recovery would use similar rules as in RACK-TLP to
better decide whether it was spurious RTO or loss and move from slow
start to fast recovery and set cwnd=ssthresh.
I really don't see how this change in the "blindly" retransmitted first segment
in slow start would allow modifying congestion control for RTO recovery.
The main aspect of triggering loss recovery that is new is the approach of allowing a sender
to transmit one additional "probe" segment in flight after 2*SRTT. Once this is accepted, the
rest of the recovery process essentially follows from principles already generally accepted
in the IETF TCP community.
Could you please see above and explain (or provide a pointer to an RFC)
what are those "principles already generally accepted in the IETF TCP
community". That would help me to understand your point.
Put another way, it seems to me that if one is to object to TLP-triggered fast recovery, then
the objection must be mounted specifically against the permission granted to the sender to
transmit one additional "probe" segment in flight after 2*SRTT. Once that permission is
granted, there is nothing really new about TLP-triggered fast recovery.
I am sorry but I still fail to see what is the preceding evidence that
makes this not new. A pointer could help.
In my view the probe is nothing to object to as long as it is not
considered a cwnd increase in the later cwnd&ssthresh calculation
(a minor detail, but someone might later suggest first two, then four, and so
on probe segments with the justification that it is just one more than
earlier).
Furthermore, there is neither implementation nor experimental experience
evaluating this change. The implementation with experimental experience
uses PRR (RFC 6937) which is an Experimental specification including a
novel "trick" that directs PRR fast recovery to effectively use slow
start in this case at hand.
What do you think of Yuchung's latest suggestion (Thursday afternoon, Dec 17) for new text in
"9.3. Interaction with congestion control", which explicitly
recommends PRR? As mentioned earlier in this thread, there is considerable implementation and
experimental experience with RACK-TLP plus PRR since the Linux TCP stack has been using
RACK-TLP with PRR as the default loss recovery algorithm since Linux v4.18 in August 2018.
As I have already indicated, in my view PRR does not have the problem we
are discussing here because PRR-SSRB makes fast recovery behave like
slow start. And PRR-CRB is even more conservative. So it would be a safe
choice for this problem unlike the current RFC 6675 algorithm.
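
To illustrate why the PRR reduction bounds behave this way, here is a minimal sketch of the per-ACK send-count computation from RFC 6937, with all quantities in segments (MSS = 1). The function name, argument layout, and the sample scenario (pipe assumed to be 0 once the whole flight of 100 segments has been marked lost and the probe has been acked) are assumptions for illustration, not text from the draft.

```python
import math


def prr_sndcnt(delivered_data, prr_delivered, prr_out, pipe,
               ssthresh, recover_fs, conservative=False, mss=1):
    """Segments PRR allows to be sent in response to one ACK (RFC 6937).

    Returns (sndcnt, updated prr_delivered). conservative=True selects
    PRR-CRB; otherwise PRR-SSRB is used.
    """
    prr_delivered += delivered_data
    if pipe > ssthresh:
        # Proportional rate reduction while above ssthresh.
        sndcnt = math.ceil(prr_delivered * ssthresh / recover_fs) - prr_out
    else:
        if conservative:
            limit = prr_delivered - prr_out                             # PRR-CRB
        else:
            limit = max(prr_delivered - prr_out, delivered_data) + mss  # PRR-SSRB
        sndcnt = min(ssthresh - pipe, limit)
    return max(sndcnt, 0), prr_delivered


# Assumed scenario: RecoverFS = 100, ssthresh = 50, pipe = 0, one segment acked.
print(prr_sndcnt(1, 0, 0, pipe=0, ssthresh=50, recover_fs=100)[0])                     # 2
print(prr_sndcnt(1, 0, 0, pipe=0, ssthresh=50, recover_fs=100, conservative=True)[0])  # 1
```

Under these assumptions PRR-SSRB permits two segments on the first ack, the same as slow start from cwnd=1, and PRR-CRB permits one, rather than the ssthresh worth of segments that an unmodified cwnd=ssthresh step would allow.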
In other words, I only object to allowing the use of RACK-TLP with the
RFC 6675 congestion control algorithm unmodified because it does not have
a safeguard like PRR. This does not mean that RACK-TLP document would
need to include the necessary modifications to the RFC 6675 algorithm.
I don't know the process well, but perhaps PRR cannot be used as a normative
requirement because it is currently Experimental? Not quite sure though.
Best regards,
/Markku
The exact commit is:
b38a51fec1c1 tcp: disable RFC6675 loss detection
best,
neal