Re: [Last-Call] [tcpm] Last Call: <draft-ietf-tcpm-rack-13.txt> (The RACK-TLP loss detection algorithm for TCP) to Proposed Standard

Hi Neal,

On Fri, 18 Dec 2020, Neal Cardwell wrote:



On Wed, Dec 16, 2020 at 2:39 PM Markku Kojo <kojo@xxxxxxxxxxxxxx> wrote:
      > For (2), the RTO timer is still operative so
      > the RTO recovery rules would still follow.

      In short:
      With a non-RACK-TLP implementation, when the timer (RTO) expires:
      cwnd=1 MSS, and slow start is entered.
      With a RACK-TLP implementation, when the timer (PTO) expires,
      normal fast recovery is entered (unless PRR is also
      implemented). So no RTO recovery, as explicitly stated in Sec. 7.4.1.

      This means that this document explicitly modifies standard TCP congestion
      control when there are no acks coming and the retransmission timer
      expires

      from: RTO=SRTT+4*RTTVAR (RTO used for arming the timer)

It's also worth mentioning this aspect of [RFC6298]:

Sure.

   (2.4) Whenever RTO is computed, if it is less than 1 second, then the
         RTO SHOULD be rounded up to 1 second.
 
             1. RTO timer expires
             2. cwnd=1 MSS; ssthresh=FlightSize/2; rexmit one segment
             3. Ack of rexmit sent in step 2 arrives
             4. cwnd = cwnd+1 MSS; send two segments
             ...

      to:   PTO=min(2*SRTT,RTO) (PTO used for arming the timer)
             1. PTO timer expires
             2. (cwnd=1 MSS); (re)xmit one segment
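
To make the two timer values concrete, here is a minimal Python sketch of the formulas exactly as they are written in this thread. It is an illustration, not the draft's full timer logic: the function names are hypothetical, clock granularity is ignored, and the draft's additional PTO adjustments are deliberately left out.

```python
def rto_seconds(srtt: float, rttvar: float, min_rto: float = 1.0) -> float:
    """RTO = SRTT + 4*RTTVAR, rounded up to the 1-second floor of RFC 6298 (2.4).

    Assumption: clock granularity is ignored, as in the thread's shorthand.
    """
    return max(srtt + 4.0 * rttvar, min_rto)


def pto_seconds(srtt: float, rttvar: float, min_rto: float = 1.0) -> float:
    """PTO = min(2*SRTT, RTO), the simplified form used in this thread."""
    return min(2.0 * srtt, rto_seconds(srtt, rttvar, min_rto))


if __name__ == "__main__":
    # Hypothetical path with SRTT = 100 ms and RTTVAR = 20 ms.
    print("RTO:", rto_seconds(0.1, 0.02))
    print("PTO:", pto_seconds(0.1, 0.02))
```

With a short SRTT the 1-second floor dominates the RTO, so the PTO fires much earlier than the RTO would; with an SRTT near or above one second the two values converge.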


It may be worthwhile to point out here that the RACK-TLP draft does not specify setting cwnd
to 1 at this point, and the Linux TCP implementation from our team does not do this. The

Yes, that's why I put it in parentheses. In my view the RACK-TLP draft implicitly limits cwnd to one segment by allowing just one TLP probe segment.

rationale is that at this point there is no solid evidence that anything has been lost, and
setting cwnd to 1 at this point would make the algorithm more timid than the preceding
approaches, for no good reason.

Sure, no need to set cwnd at this point.

A good reason could be: no feedback, so the ACK clock is lost? But, of course, it is too early at this point, even though the sender may well modify cwnd again once the ACK arrives, as it now does if it decides that the loss was something other than the probe segment.
  
             3. Ack of (re)xmit sent in step 2 arrives
             4. cwnd = ssthresh = FlightSize/2; send N=cwnd segments


That step (4) assumes a particular congestion control implementation that is different than
what we would recommend.

Ok. I just used the Standards Track formula, as does the RACK-TLP draft in its examples, and because the RACK-TLP draft states that it does not modify current congestion control.

      For example, if FlightSize is 100 segments when timer expires,
      congestion control is the same in steps 1-3, but in step 4 the
      current standard congestion control allows transmitting 2 segments,
      while RACK-TLP would allow blasting 50 segments.
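
The arithmetic behind the 2-versus-50 figure can be sketched in Python as follows. This is only an illustration of the step lists quoted in this thread: the function names are hypothetical, units are whole segments, and the lower bound of 2 segments on ssthresh is taken from RFC 5681 rather than from the text above.

```python
def cwnd_after_first_ack_rto(flight_size: int) -> int:
    """Steps 1-4 of the RTO case: collapse to one segment, then slow start."""
    ssthresh = max(flight_size // 2, 2)  # ssthresh = FlightSize/2, at least 2 segments
    cwnd = 1                             # step 2: cwnd = 1 MSS
    if cwnd < ssthresh:
        cwnd += 1                        # step 4: one more MSS for the first ACK
    return cwnd


def cwnd_after_first_ack_fast_recovery(flight_size: int) -> int:
    """Step 4 of the PTO case as described in this thread: cwnd = ssthresh."""
    ssthresh = max(flight_size // 2, 2)
    return ssthresh


if __name__ == "__main__":
    flight_size = 100  # segments outstanding when the timer expires
    print("RTO recovery  :", cwnd_after_first_ack_rto(flight_size))
    print("fast recovery :", cwnd_after_first_ack_fast_recovery(flight_size))
```

The sketch assumes nothing else is counted as in flight when the first ACK arrives, so the whole cwnd can be sent at once; that is the worst case the example is concerned with.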

      Question is: what is the justification to modify standard TCP
      congestion control to use fast recovery instead of slow start for a
      case where timeout is needed to detect loss because there is no
      feedback and ack clock is lost? The draft does not give any
      justification. This clearly is in conflict with items (0) and (1)
      in BCP 133 (RFC 5033).


The draft pointedly does not modify standard TCP congestion control.

RACK-TLP does not specify using fast recovery instead of slow start for a  case where timeout
is needed to detect loss because there is no  feedback and the ACK clock is lost. Rather,
RACK-TLP only triggers fast recovery if there *is* ACK feedback providing an ACK clock and
strong evidence of a packet loss.

So here our views diverge. In the above steps I decoupled congestion control from what segments are sent (rexmit and xmit are mentioned there just as comments to check what is going on, they can be freely removed).
Congestion control governs how many segments can be sent.

In my view, when there is no feedback, RACK-TLP uses a timeout (PTO) to help make progress. Without the timeout it cannot make progress. Just like an RFC 5681 sender, it cannot make progress until the timeout expires. So this should be taken as the criterion to (effectively) enter slow start, once loss is detected.

Or, at least, I don't see why a different timeout value would change the congestion control.

When the timeout expires, RACK-TLP sends one segment (just like an RFC 5681 sender when the RTO expires). The only difference is that an RFC 5681 sender selects a different segment (the first unacknowledged segment) to retransmit "blindly" in order to get feedback and start the ACK clock. RACK-TLP "blindly" sends the last segment from the retransmission queue (or a new segment). Selecting a different segment for transmission upon timeout does not change anything, in my view. In both cases it is a "blind" selection; the sender does not know what was lost. And in both cases the ACK for this one segment provides feedback about what potentially has been lost. The only difference there is that the segment RACK-TLP selects to transmit is a better choice when the SACK option is in use, because it provides more information.
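
The difference in segment selection described above can be sketched in a few lines of Python. This is a simplified illustration only: the function names and the representation of the retransmission queue as an ordered list of segment numbers are assumptions, and receive-window checks are omitted.

```python
from typing import Optional, Sequence


def rto_choice(unacked: Sequence[int]) -> Optional[int]:
    """On RTO expiry: retransmit the first unacknowledged segment."""
    return unacked[0] if unacked else None


def tlp_choice(unacked: Sequence[int], next_new: Optional[int]) -> Optional[int]:
    """On PTO expiry: send a new segment if one is available,
    otherwise the last segment in the retransmission queue."""
    if next_new is not None:
        return next_new
    return unacked[-1] if unacked else None


if __name__ == "__main__":
    queue = [1, 2, 3, 4, 5]  # hypothetical unacknowledged segments, oldest first
    print("RTO retransmits:", rto_choice(queue))
    print("TLP probes with:", tlp_choice(queue, next_new=None))
```

In both functions the choice is made without any knowledge of what was actually lost. With SACK, an ACK for the highest segment can report holes below it, which is why the TLP choice yields more information.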

If there is some difference in that the ACK for RACK-TLP provides stronger evidence for packet loss (and for what was lost), then it should also be OK to modify the current standard TCP congestion control such that upon RTO timeout the sender does not select the first unacknowledged segment for blind retransmission but the last segment in the retransmission queue (or maybe a new segment). With SACK this would provide exactly the same information as the TLP probe does. And, upon arrival of the first ACK, RTO recovery would use similar rules as in RACK-TLP to better decide whether it was a spurious RTO or a loss, and move from slow start to fast recovery and set cwnd=ssthresh.

I really don't see how this change in the "blindly" retransmitted first segment in slow start would allow modifying congestion control for RTO recovery.

The main aspect of triggering loss recovery that is new is the approach of allowing a sender
to transmit one additional "probe" segment in flight after 2*SRTT. Once this is accepted, the
rest of the recovery process essentially follows from principles already generally accepted
in the IETF TCP community.

Could you please see above and explain (or provide a pointer to an RFC) what are those "principles already generally accepted in the IETF TCP community". That would help me to understand your point.

Put another way, it seems to me that if one is to object to TLP-triggered fast recovery, then
the objection must be mounted specifically against the permission granted to the sender to
transmit one additional "probe" segment in flight after 2*SRTT. Once that permission is
granted, there is nothing really new about TLP-triggered fast recovery.

I am sorry but I still fail to see what is the preceding evidence that makes this not new. A pointer could help.

In my view the probe is nothing to object to, as long as it is not considered a cwnd increase in the later cwnd & ssthresh calculation (a minor detail, but someone might later suggest first two, then four, and so on probe segments, with the justification that it is just one more than earlier).

      Furthermore, there is no implementation nor experimental experience
      evaluating this change. The implementation with experimental experience
      uses PRR (RFC 6937) which is an Experimental specification including a
      novel "trick" that directs PRR fast recovery to effectively use slow
      start in this case at hand.


What do you think of Yuchung's latest suggestion for new text in "9.3.  Interaction with
congestion control" suggested by Yuchung Thursday afternoon (Dec 17), which explicitly
recommends PRR? As mentioned earlier in this thread, there is considerable implementation and
experimental experience with RACK-TLP plus PRR since the Linux TCP stack has been using
RACK-TLP with PRR as the default loss recovery algorithm since Linux v4.18 in August 2018.

As I have already indicated, in my view PRR does not have the problem we are discussing here, because PRR-SSRB makes fast recovery behave like slow start. And PRR-CRB is even more conservative. So it would be a safe choice for this problem, unlike the current RFC 6675 algorithm.
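
The effect of the two PRR reduction bounds can be illustrated with a short Python sketch modeled on the per-ACK pseudocode in RFC 6937. It is a sketch under stated assumptions, not an implementation: units are whole segments (MSS = 1), the function name is hypothetical, `prr_delivered` is assumed to already include the current ACK's `delivered`, and the final clamp to zero is added here for safety.

```python
import math


def prr_sndcnt(delivered: int, prr_delivered: int, prr_out: int,
               pipe: int, ssthresh: int, recover_fs: int,
               conservative: bool = False, mss: int = 1) -> int:
    """Number of segments PRR allows to be sent in response to one ACK."""
    if pipe > ssthresh:
        # Proportional rate reduction while pipe is still above ssthresh.
        sndcnt = math.ceil(prr_delivered * ssthresh / recover_fs) - prr_out
    else:
        if conservative:
            limit = prr_delivered - prr_out                        # PRR-CRB
        else:
            limit = max(prr_delivered - prr_out, delivered) + mss  # PRR-SSRB
        sndcnt = min(ssthresh - pipe, limit)
    return max(sndcnt, 0)  # clamp added in this sketch


if __name__ == "__main__":
    # Hypothetical first ACK in recovery: FlightSize was 100, ssthresh is 50,
    # one segment was delivered, and pipe is assumed to have drained to 0.
    common = dict(delivered=1, prr_delivered=1, prr_out=0,
                  pipe=0, ssthresh=50, recover_fs=100)
    print("PRR-SSRB:", prr_sndcnt(**common))
    print("PRR-CRB :", prr_sndcnt(**common, conservative=True))
```

Under these assumed values PRR-SSRB permits two segments on the first ACK, the same as slow start after an RTO, and PRR-CRB permits one, whereas setting cwnd=ssthresh directly with an empty pipe would permit 50.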

In other words, I only object to allowing the use of RACK-TLP with the unmodified RFC 6675 congestion control algorithm, because it does not have a safeguard like PRR. This does not mean that the RACK-TLP document would need to include the necessary modifications to the RFC 6675 algorithm.

Process-wise, I don't know, but PRR possibly cannot be used as a normative requirement because it is currently Experimental? I'm not quite sure, though.

Best regards,

/Markku

The exact commit is:

  b38a51fec1c1 tcp: disable RFC6675 loss detection

best,
neal
-- 
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call
