Re: [Last-Call] [tcpm] Genart last call review of draft-ietf-tcpm-rto-consider-14

Stewart Bryant <stewart.bryant@xxxxxxxxx> · Thu, 18 Jun 2020 11:00:15 +0100



    See a few comments (marked GF)
      from the perspective of other transport RFCs, in case this helps
      you find text...

      
      -------- Forwarded Message --------
      Subject: Re: [tcpm] Genart last call review of draft-ietf-tcpm-rto-consider-14
      Date: Thu, 18 Jun 2020 11:00:15 +0100
      From: Stewart Bryant <stewart.bryant@xxxxxxxxx>
      To: Martin Duke <martin.h.duke@xxxxxxxxx>
      CC: tcpm <tcpm@xxxxxxxx>, Review Team <gen-art@xxxxxxxx>,
        Mark Allman <mallman@xxxxxxxx>, Last Call <last-call@xxxxxxxx>,
        Stewart Bryant <stewart.bryant@xxxxxxxxx>,
        tom petch <daedulus@xxxxxxxxxxxxx>,
        draft-ietf-tcpm-rto-consider.all@xxxxxxxx
        
          On 17 Jun 2020, at 18:20, Martin Duke <martin.h.duke@xxxxxxxxx>
            wrote:
          

            Hi Stewart,
              

              If there are no further objections, I'm
                going to declare consensus.
            
            
              On Thu, Jun 11, 2020 at
                1:45 PM Martin Duke <martin.h.duke@xxxxxxxxx>
                wrote:

              
                Stewart,
                  

                  do we need more cycles for this, or is
                    draft-15 sufficient to address your concerns?
                
                
                  On Mon, Jun 8, 2020
                    at 12:52 PM Mark Allman <mallman@xxxxxxxx> wrote:

                  
                    Hi Stewart, et al.!

                    
                    I just submitted a new version of rto-consider.  Please ask the
                    datatracker for diffs between this and rev -14.  The highlights:

                    
                      - The diffs with the last rev are here: https://tools.ietf.org/rfcdiff?difftype=--hwdiff&url2=draft-ietf-tcpm-rto-consider-15.txt

                  
          In the general case, delay across a
    network path depends not only on distance, but also a number of
    variable components such as the route and the level of buffering in
    intermediate devices.
          

          It is more the contending/conflicting traffic
            than the buffering, or perhaps the time spent in
            queues, but “buffering” is a transport colloquial
            term.
          

          GF: The word being sought might be "queueing" (I
            think that buffering is thought of as memory, and hence
            the maximum queue).

          
            Since our wide-area network paths are best
    effort, packet loss is a regular occurrence. 
            

          No, the best effort Internet experiences this.
            There are many well engineered WANs that do not.
          

          What I am not seeing is clearer text that
            distinguishes between user traffic and “engineering” traffic
            that is used to make the network work, and between the end
            to end traffic and traffic within an AS that may be there
            for other purposes (high value service also offered by the
            provider) and WANs that are well engineered.
          

          Perhaps we could include a clearer disclaimer
            regarding the non-best-effort-internet-end-to-end traffic?
          

          You have some text on this down in section 2 but
            it is a bit buried.
          

          Perhaps something early on of the form: This
            document is specifically concerned with end to end behaviour
            over the best effort Internet. As noted in section 2 it may
            not be applicable to other types of WAN, or to the traffic
            used in affecting the operation of the Internet itself.
          

            GF: Actually, I do think a well-engineered
              WAN can be in scope of your spec. The two words I was
              expecting were "controlled environment" or
              "pre-provisioned" capacity; these might not see the same
              path properties. A DC is typically regarded in transport
              specs as a "controlled environment".
          
          
             An exception to this rule is if an IETF standardized mechanism
        determines that a particular loss is due to a non-congestion
        event (e.g., packet corruption).  
            

          That is a bit heavy. It should be “a protocol”
            rather than an IETF standardized mechanism. The IETF does
            not have a monopoly on pre-blessing protocols before they
            are deployed.
          

        GF: Unsure myself what is needed - isn't this guidance for
          design of protocol mechanisms?

        
                      - All small comments addressed.

                    
                      - I think we all agree that this is not a one-size-fits-all
                        situation.  Rather, this document is meant to be a default case.
                        So, the main action of this rev is to make that point more
                        clearly.  The first paragraph in the intro is new.  Also, there
                        are some more words fleshing out the context more in section 2.
                        In particular, more emphatically making the point that other
                        loss detectors are fine for specific cases.

                  
        As I note above, from a routing and packet transport (as
          opposed to the transport layer) perspective, I think we should
          more clearly recognise at the beginning the fact that this is
          for the worst case network, not for well engineered (WAN and
          DC) networks and the mechanisms fundamental to the operation
          of the network itself.
        

                      - The first paragraph in the intro also makes clear we adopt the
                        loss == congestion model (as that is the conservative default,
                        not because it is always true).

                      - I made one other change that wasn't exactly called for, but
                        seems like an oversight.

                    
                        Previously guideline (4) said loss MUST be taken as an
                        indication of congestion and some standard response taken.  But,
                        this guideline has an explicit exception for cases where we know
                        the loss was caused by some non-congestion event.  Guideline (3)
                        says you MUST back off.  But, it did not have this exception for
                        cases where we can tell the cause.  But, I think based on the
                        spirit of (4), (3) should also have these words.  So, I added
                        them.

                  
        In some cases you cannot tell the cause, but it is more
          important to ignore the loss. OAM being a particularly good
          example.
        

                        Also, I swapped (3) and (4) because it seemed more natural in
                        re-reading to first think about taking congestion action and
                        then dealing with backoff.  I think the ordering is a small
                        thing, but folks can yell and I'll put it back if there is
                        angst.
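
For readers following the guideline discussion above, here is a rough
sketch of how "take congestion action, then back off, unless the loss
is known to be a non-congestion event" might look in code. It is only
an illustration under stated assumptions, not text or an algorithm from
the draft: the names (LossCause, RetransmitState, on_timeout_loss), the
halving of an abstract rate as the "standard response", the doubling of
the timer, and the numeric values are all hypothetical.

```python
from dataclasses import dataclass
from enum import Enum, auto


class LossCause(Enum):
    """Hypothetical classification of a detected loss."""
    UNKNOWN = auto()      # cause not known: treated as congestion by default
    CORRUPTION = auto()   # known non-congestion event (e.g., packet corruption)


@dataclass
class RetransmitState:
    rto: float = 1.0       # current retransmission timeout in seconds (illustrative)
    max_rto: float = 60.0  # illustrative upper bound on the timeout
    rate: float = 100.0    # abstract sending rate, arbitrary units


def on_timeout_loss(state: RetransmitState,
                    cause: LossCause = LossCause.UNKNOWN) -> RetransmitState:
    """Apply the default response to a loss detected by timer expiry."""
    if cause is LossCause.CORRUPTION:
        # Exception: the loss is known not to be congestion, so neither
        # the congestion response nor the backoff is applied.
        return state
    # Congestion action first (assumed here: halve the rate, floor of 1.0).
    state.rate = max(1.0, state.rate / 2)
    # Then back off the timer (assumed here: double it, capped at max_rto).
    state.rto = min(state.max_rto, state.rto * 2)
    return state
```

Calling on_timeout_loss(state) with no cause given takes the
conservative path. Stewart's OAM point below would correspond to a
further case where the caller ignores the loss without knowing the
cause; that case is not modelled in this sketch.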

                    
                    Please take a look and let me know if this helps things along or
                    not.

                    
                    allman

                  
      We are getting there, but I would ask that you take the
        transport hat off and look again from an infrastructure and
        packet transport perspective.
      

      Best regards
      

      Stewart
      


      _______________________________________________
tcpm mailing list
tcpm@xxxxxxxx
https://www.ietf.org/mailman/listinfo/tcpm

    
-- 
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call
