Re: [Last-Call] [Tsv-art] [Lwip] Tsvart last call review of draft-ietf-lwig-minimal-esp-06

Bob Briscoe <ietf@xxxxxxxxxxxxxx> · Tue, 28 Sep 2021 22:24:59 +0100



    Daniel,

    
    I don't think an area reviewer has to 'approve' that your changes
    are satisfactory. But I will take a quick look. I think WG chairs,
    the doc shepherd, the IESG, or whoever it may concern might look at
    comments in area reviews to check they have been addressed, but I
    don't think this is meant to be a heavyweight process that you have
    to 'pass'.

    
    Pls see inline, tagged [BB]...

    
    On 24/09/2021 15:04, Daniel Migault
      wrote:

    
      Hi Bob, 
        

        I am willing to publish the changes on the git. Let me know
          if you are planning to do a pull request or if you have any
          additional comments. 
        

        Yours, 

          Daniel
      
      
        On Thu, Sep 2, 2021 at 5:37 PM
          Daniel Migault <mglt.ietf@xxxxxxxxx> wrote:

        
            Hi Bob, 
            

            Thanks for the careful review. I am responding inline
              to your comments. I have uploaded the version with the
              changes here:
            https://github.com/mglt/draft-mglt-lwig-minimal-esp/commit/e20ac927b8d0f41e91868a9c2bb81a765f337fbe

            
            Yours, 

              Daniel
            

              On Wed, Sep 1, 2021 at
                7:54 PM Bob Briscoe via Datatracker <noreply@xxxxxxxx> wrote:

              
              Reviewer: Bob Briscoe

                Review result: On the Right Track

                
                This document has been reviewed as part of the transport
                area review team's

                ongoing effort to review key IETF documents. These
                comments were written

                primarily for the transport area directors, but are
                copied to the document's

                authors and WG to allow them to address any issues
                raised and also to the IETF

                discussion list for information.

                
                When done at the time of IETF Last Call, the authors
                should consider this

                review as part of the last-call comments they receive.
                Please always CC

                tsv-art@xxxxxxxx if you
                reply to or forward this review.

                
                ==A) General Comments==

                
                ===A.1) Caveat Regarding Transport Area Expertise ===

                
                Although this review is on behalf of the transport area
                review team, I don't

                claim to know much about the interactions between IPsec
                and the transport layer

                (e.g. NAT traversal, etc), which is not my area of
                expertise.

                
                ===A.2). The document doesn't do what the Title,
                Abstract and Intro say===

                
                Abstract

                   This document describes a minimal implementation of
                the IP

                   Encapsulation Security Payload (ESP) defined in RFC
                4303.  Its

                   purpose is to enable implementation of ESP with a
                minimal set of

                   options to remain compatible with ESP as described in
                RFC 4303.

                
                Introduction

                   ...This document describes the minimal properties an
                ESP

                   implementation needs to meet to remain interoperable
                with [RFC4303]

                   ESP.  In addition, this document also provides a set
                of options to

                   implement these properties under certain constrained
                environments.

                
                The draft doesn't define a minimal implementation
                (singular). It gives

                considerations that might help minimize each of many
                application-specific

                implementations (plural).

                
                Given the title, abstract and introduction, the reader
                expects a draft that

                defines a single clear profile that is a subset of the
                generic ESP protocol -

                perhaps like minimal IKEv2 [RFC7815] is a subset of
                IKEv2 [RFC7296]. That

                expectation rapidly  leads to disappointment. Indeed,
                the style of this draft

                is quite the opposite from a single recommended
                implementation. The draft's

                style is not even clear-cut conditional statement like
                "in scenario X do Y".

                The style is more like "Think carefully about this, that
                and the other".

                
                So the word "Considerations" definitely ought to be
                included in the title.

                Perhaps "Considerations for Minimizing Encapsulation
                Security Payload (ESP)

                Implementations" (also note the abbreviation is
                expanded).

              
    [BB] Title: Pls consider expanding the ESP abbreviation. 

    If this were about TCP, DNS, IPsec, BGP, HTTP, they certainly
    wouldn't need expansion. 

    But ...ESP? ...nyuh.

    
              I got your point,  overall the document details how
                to implement ESP with the minimum functionalities
                to remain compatible with 4303 which I think is a
                minimal ESP. The document also describes how in a
                constrained world these functions can be implemented. I
                think we should clarify the protocol and implementation
                aspects in the abstract as follows:https://datatracker.ietf.org/doc/html/rfc793
               
              
              Abstract:
              """
              This document describes a
                    minimal IP Encapsulation Security Payload (ESP)
                    defined in RFC 4303. Its purpose is to enable
                  implementation of ESP with a minimal set of options to
                  remain compatible with ESP as described in RFC 4303.
                                           

                  A minimal version of
                    ESP is not intended to become a replacement of the
                    RFC 4303 ESP.
              Instead, a minimal
                    implementation is expected to be optimized for
                    constrained environment while remaining
                  interoperable with implementations of RFC 4303 ESP.
                               

                  In addition, this
                    document also provides some considerations to
                    implement minimal ESP in a constrained
                  environment which includes limiting the number of
                  flash writes, handling frequent wakeup / sleep states,
                  limiting wakeup time, or reducing the use of random
                  generation.          
                                                                         
              This document does not
                    update or modify RFC 4303, but provides a compact
                    description of how to implement the minimal
                  version of the protocol. RFC 4303 remains the
                    authoritative description.

                
              """
              

              For the
                  introduction I do not believe this should be clarified
                  and it seems the following text is clear enough. Is
                  there anything that we should clarify ?
              

              """
              This document describes the
                    minimal properties an ESP implementation needs to
                    meet to remain interoperable with <xref
                  target="RFC4303"/> ESP.
                                                 

                  In addition, this
                    document also provides a set of options to implement
                    these properties under certain constrained
                  environments.
                                                             

                  This document does not
                    update or modify RFC 4303, but provides a compact
                    description of how to implement the minimal
                  version of the protocol.                              

                  RFC 4303 remains the
                    authoritative description.  

                
              """
              

    [BB] Abstract: The doc really does not describe just one minimal
    ESP. Why write an abstract that describes something different to
    what the document is?

    
    Intro: Agree it's OK.

    
               ===A.3).
                Interoperability===

                
                After Nancy Cam-Winget's review of -03, the sentence in
                the introduction was

                clarified that interop meant interop with full RFC4303
                implementations

                
                   ... to remain interoperable with [RFC4303] ESP.

                
                However, in the sections on specific fields, I believe
                there are cases where

                interop seems to be restricted to an app-specific ESP
                peer rather than any

                general RFC4303 peer. For instance: * §2 suggests use of
                an SPI that includes
               the memory address
                of the SAD structure, but it is not spelled out how the

                remote (possibly vanilla RFC4303) peer will know to
                apply an SPI to packets

                that will have the appropriate value for the local peer
                to locate in its SAD,

                given the SPI set by the remote peer is the local peer's
                inbound SPI. 
              

              Well the remote peer does not need to know how you
                generate the SPI nor how you handle the SPI. In this
                example, you tell the peer to use the SPI that
                corresponds to your memory address. The peer will use
                that SPI for your incoming packet. Upon receiving the
                packet you interpret the SPI as a memory address. Note
                that for outgoing packets, the lookup is based on the
                traffic selectors - that is the clear text packet.
            
          
    [BB] OK - I'll concede that one :)

    But I'm sure there were other examples that seemed app-specific
    (like the next one), but I'll leave you to check them all.

    
              * In §3

                it is remarked that the sender's use of time to generate
                the SN would require

                the receiver to be appropriately configured, which
                implies the receiving peer

                would somehow have to know the typical size of the
                sending peer's SN increments

                to configure an appropriate window size.

                
              I see your point but note that 4303 only requires an
                only increasing function to be used to generate the SN.
                While in some cases this might lead to sub optimal
                configuration this still provides interoperability.
                Suppose that one peer defines a window size to be set to
                X. This means that it is willing to accept no more than
                X packets. Using a packet counter you may reach that
                number, which is unlikely to be the case when you use
                time. But in any case you do not exceed the window size.
                You may have a few additional packets to retransmit
                though.
            
          
    [BB] When using time, the range of seq numbers covered for say 100
    packets would typically be significantly more than 100 (see later
    about irregular inter-packet time intervals). 

    When you say there would be a few retransmissions, does that imply
    that you're thinking that a sequence of packet timestamps would map
    to SNs that sometimes collide and then cause false positive replay
    detection. Surely you wouldn't want the granularity of the mapping
    between time and SNs to be that poor?

    
               ===A.4)
                Completeness===

                
                The IPsec protocol fields are used as the framework to
                hang the document

                contents from. But surely the data plane is not the only
                part of an IPsec ESP

                implementation? What about restricting the range of
                valid values in an SA

                itself, to reduce complexity? What about reducing
                complexity of the SAD

                (mentioned in §2, but not addressed specifically in its
                own right)? What about

                simplifying the management interface? There's no mention
                of simplifying (or

                eliminating) auditing, which is optional in RFC4303.

                
                I am not involved in the security area, so apologies if
                all the above is dealt

                with in other lwig drafts.

              
              Well, IPsec defines an architecture and ESP is one
                protocol used to transmit data, so in essence it is
                restricted to the data plane. The document considers
                cases when the device is provisioned with a long term
                key, but in general we recommend IKEv2 to configure ESP.
                The IPsec architecture does not define how to configure
                ESP. Only recently a YANG model has been defined.
              Reducing the complexity of the SAD seems to me out of
                scope of the document.
            
          
    [BB] That's what I'm asking - the document needs to say what is and
    isn't in scope - and why. 

    Why is the SAD out of scope? It's what ESP uses for every packet
    lookup. Streamlining the complexity of the SAD is surely likely to
    significantly help a constrained device.

    Why is auditing out of scope? Not supporting auditing would
    considerably lighten a constrained implementation.

    All the code to supporting an interface to manage the device is
    likely to take up more memory than the data plane, unless the full
    extent of configuration and management flexibility is cut to the
    bone.

    
                ===A.5) 'It is recommended' ambiguity===

                
                There are 5 occurrences of this ambiguous phrase. The
                first two examples (at

                least) are ambiguous:

                
                   For nodes supporting only unicast communications, it
                is recommended

                   to index SA with the SPI only.

                ...

                   For inbound traffic, it is recommended that any
                receiver provides

                   anti-replay protection,

                
                Do these mean that RFC4303 recommends these things, or
                that the present draft

                recommends them for minimal implementations? Pls check
                the other 3 occurrences

                too.

              
              recommendations are from the document, but closely
                rephrasing 4303. 4303 says:
              

                 The SPI has a local significance to index the Security Association
   (SA).  From [RFC4301] section 4.1, nodes supporting only unicast
   communications can index their SA only using the SPI.  On the other
   hand, nodes supporting multicast communications must also use the IP
   addresses and thus SA lookup needs to be performed using the longest
   match.
                and
                The SN is set by the sender so the receiver can implement anti-replay
   protection.  The SN is derived from any strictly increasing function
   that guarantees: if packet B is sent after packet A, then SN of
   packet B is strictly greater then the SN of packet A.
In the early version of the document we used RECOMMENDED but we have been asked to use lower letters as the document is informational. I am wondering if the confusion comes from the use of lower letters.
              
            
    [BB] No, the confusion comes from the use of the passive, which
    doesn't say what is doing the recommending: this draft or RFC4303?
    Please just say either "the present document recommends" or "RFC4303
    recommends", whichever you mean.

    
               ===A.6) 'May not'
                ambiguity ===

                
                There's 5 occurrences of this too. I think only the
                first 2 are ambiguous.

                
                  ... nodes designed to only send data may not implement
                this capability.

                
                   ...a minimal ESP implementation may not

                   generate such dummy packet.

                
                Pls avoid it. In English, 'may not' can either mean
                'might not' or 'must not',

                depending where the emphasis is.

              
              Thanks, the meaning here is "might not". A "must not"
                would update 4303. I will leave it as it is until we are
                sure we do not re capitalize the text, but if it remains
                with lower letters I will check with my ADs / rfc editor
                what is most appropriate. 

              
    [BB] Pls do not leave it as it is. If it means might not, say might
    not. 

    It is coincidental that we're talking about two words (may and must)
    that are sometimes capitalized. My point was nothing to do with
    whether these were normative. The problem is purely that the phrase
    "may not" should be avoided in English because it is invariably
    ambiguous.

    
                ===A.7) Try to avoid words that would be normative if
                upper case===

                
                It's up to the authors, but I would advise against using
                must, should, may etc.

                in lower case in any RFC, given readers sometimes
                imagine that they might have

                been intended to mean MUST, SHOULD, or MAY. Suggested
                lower case alternatives:

                have to, ought to, and might.

              
              In our case that is how they came. I will check with
                our AD/IESG what their preference is. My
                personal preference would be upper case, than lower case
                - I think for the exact reasons you raised. Thanks for
                providing the alternatives, that is useful. 
              

                ==B) Specific Technical Points==

                ===B.1) Vague sentence in §2===

                
                   Some other local constraints

                   on the node may require a combination of the SPI as
                well as other

                   parameters to index the SA.

              
              I propose the following change:
              OLD:
              """
              For nodes supporting only
                    unicast communications, it is recommended to index
                    SA with the SPI only.       

                  The index may be based
                    on the full 32 bits of SPI or a subset of these
                    bits.          

                  Some other local
                    constraints on the node may require a combination of
                    the SPI as well as other parameters to index
                  the SA.

                
              """
              

              NEW:
              
                """
                For nodes supporting only
                      unicast communications, it is recommended to index
                      SA with the SPI only.       

                    The index may be
                      based on the full 32 bits of SPI or a subset of
                      these bits.          

                    The node may require
                      a combination of the SPI as well as other
                    parameters (like the IP address) to index the SA.

                  
                """
                

                ===B.2) Last para of §2.1 ends up in the air===

                
                I think this para is trying to say that there's no need
                to ensure that SPI

                generation is properly random if a privacy analysis of
                the application doesn't

                require this. It falls short, needing a final sentence
                to actually say this, if

                that is what it intends to say.

                
              ok I propose the following changes:
              

              OLD:
              While the use of randomly
                    generated SPIs may reduce the leakage or privacy of
                    security related information by ESP itself,
                  these information may also be leaked otherwise and a
                  privacy analysis should consider at least the type of
                  information as well the traffic pattern.        

                
              NEW:
              While
                  the use of randomly generated SPIs may reduce the
                  leakage or privacy of security related information by
                  ESP itself, these information may also be leaked
                  otherwise.
              As a result, a privacy
                analysis should consider at least the type of
                information as well the traffic pattern before
                determining non random SPI can be used. 

              
               Rather than make the
                controversial point that the state of a sensor reporting

                the open/closed state of a door doesn't typically leak
                privacy info, the

                controversy can be avoided. All that is needed is to say
                that /if/ messages for

                a particular application are not privacy sensitive, a
                randomized SPI is not

                necessary.

                
              I think the new text makes it clearer. The purpose of
                the door is to provide an illustrative example, that
                proves the existence of such situation.
            
          
    [BB] My point was that, given this is only an example, you don't
    need to say that the state of a door doesn't typically leak privacy
    info, which is a controversial statement (for instance I don't think
    I could agree with that). But it doesn't matter whether it's true,
    you don't need to say it. All you need to say is "for instance, if
    the state  of a sensor reporting the open/closed state of a door
    doesn't leak privacy info, a randomized SPI is not necessary."

    
               BTW, this does beg
                the question whether the implementer can foresee the

                different trust environments an application might be
                used in (e.g a door

                open/closed message might have no privacy implications
                for one application, but

                then a more ambitious application might be built on top
                of the original app).

                
              Application in constrained environment are by
                definition very specific, as such non constrained
                applications  developers are more likely to base their
                development on more generic implementation, unless they
                have a strong willingness to make it wrong.
              

               In summary, it's a
                moot point whether an app developer can be expected to

                properly analyse privacy vulnerabilities, or whether
                it's just better to play

                safe with a randomized SPI.

              
              I think the text clearly says random SPIs are
                recommended - which I think goes along what you are
                mentioning. The other way around is that for device that
                cannot generate these randoms. In that case it seems
                safer to provide a mean they can use ESP to protect
                their communication that they do not implement any
                security. 
              

                ===B.3) Use of time-driven SN still requires sequence
                state===

                
                   When the use of a

                   clock is considered, one should take care that
                packets associated

                   with a given SA are not sent with the same time
                value.

                
                To check that no two time values are the same requires
                holding the same state

                as would be needed to increment a counter - taking us
                back to square 1.

              
              This is enforced by design of your application and of
                course there are application this does not work.  This
                is left as an option.
            
          
    [BB] The doc needs to say that then. For example, "one should only
    consider use of a clock to generate SNs if the application will
    inherently ensure that no two packets with a given SA are sent with
    the same time value. Otherwise sequence state will be required,
    defeating the benefit of using a clock."

    
                ===B.4) Other problems with time-driven SN===

                
                §3 says

                
                   ...if not appropriately configured, the use of a

                   significantly larger SN may result in the packet out
                of the

                   receiver's windows and that packet being discarded.

                
                BTW, I think that is meant to say 'a significantly
                larger SN *increment* may

                result in...' Assuming that's what was meant, it's not
                just a question of a

                standard receiver being 'appropriately configured', but
                also whether standard

                receiver implementations would be even capable of
                supporting a much larger

                anti-replay window, which would require receivers to
                manage a significantly

                larger bit-map in memory.

              
              I think increment can mean counter +=1 and so could
                be misleading as opposed to using larger, so I prefer to
                keep larger unless this is not what needs to be
                clarified.
            
          
    [BB] OK, use "SN difference" then. But pls don't leave it as it is -
    surely you don't mean that the size of each sequence number itself
    is large.

    
              Again, there are cases this is not appropriated and
                we are not recommending to implement this everywhere.
                That said, there is no special need for the receiver if
                the receiver has a window size of 1000 and is using a
                packet count for anti-replay. The sender using time will
                makes the windows size as packets sent in the last 1000
                ms.   

              
    [BB] You seem to be imagining that packets are sent at absolutely
    regular intervals. That never happens. Imagine the designer has to
    allow for packets sent at the following monotonically increasing μs
    times, and the smallest delta is known to be 50μs. Then the best
    stateless translation into SNs will produce the following sequence:

    14568 -> 0

    15243 -> 14

    15293 -> 15

    16897 -> 47

    18432 -> 77

    18743 -> 84

    
    This has expanded the size of replay window needed by about 16x. Of
    course, the application might not be so irregular but, when mapping
    timestamps to SNs, there will always be some expansion in the
    required window size due to a degree of jitter.

    
    This tradeoff between statelessness and increased window size needs
    to be explained. Personally I cannot imagine /any/ case where saving
    one integer of state to avoid SN duplication could ever be worth the
    cost of the state needed to hold an inflated replay window, which is
    bound to be greater than one int. If such a tradeoff makes sense to
    you, then it surely needs to be explained (not to me in this email,
    but in the doc).

    
                Also, for devices that spend significant time sleeping,
                the SN would jump

                hugely on first waking. That shouldn't require any
                larger window (unless a

                stale packet from prior to the sleep was only released
                after a new packet on

                waking). But the receiver would need to be able to
                somehow detect massive jumps

                in the high order bits that are not communicated in the
                SN field.

              
              Small jumps or massive jumps have no impact on the
                windows size. The windows size only tells the packets
                that will not be rejected. These packets are those with
                the SN between LAST_SN and LAST_SN - WINDOW_SIZE.
            
          
    [BB] When a massive jump wraps the ESN, surely the SN could collide
    with one already recorded in the anti-replay window?

    
                ===B.5) The need for anti-replay protection===

                
                §3 says:

                
                   Receiving peers may also

                   implement their own anti-replay mechanism.

                
                Assuming this is meant to mean "Receiving *applications*
                may also...", it is

                true. However, as I understand it, IPsec provides a
                general anti-replay

                facility for all the cases where anti-replay is not done
                e2e in higher layers.

                Indeed, most e2e security protocols prevent replay, but
                most application logic

                does not, and sometimes, just sometimes, this might open
                up a vulnerability .

                
              IPsec anti replay mechanism only apply to the IP
                layer. I understand application replay as the same query
                being sent. If the query is in a different packet IPsec
                does not help.
              That said in IoT world, device and applications might
                be the same entity, in which case the application may
                leverage from IPsec anti-replay protection. But in
                general application logic should provide some protection
                against anti-replay. I think we agree on this. Now I am
                trying to understand if there is anything that could be
                added to the document, but I think that applications
                considerations are a bit out of scope of the document.
            
          
    [BB] I guess it's not clear why the sentence I quoted is there in
    the draft. It's perhaps hinting that IPsec anti-replay might not be
    needed? Which is why I pointed out that IPsec anti-replay is really
    doing a different job. So, if this sentence is used, it should be
    clear what implication is intended.

    
               ===B.6) TFC more
                appropriate for IoT?===

                
                One could argue that Traffic Flow Confidentiality (§4)
                is *more* applicable for

                large classes of IoT applications than other
                'traditional' applications of

                IPsec.  Many IoT apps send only a few different types of
                message, which are

                perhaps more likely to be distinguishable from the sizes
                of the encrypted

                messages. This point is made in the draft, but not
                clearly enough:

                
                   In

                   most cases, the payload carried by these nodes is
                quite small, and

                   the standard padding mechanism may also be used as an
                alternative to

                   TFC,

                
                This comes just after the draft has recommended that TFC
                is *not* used (because

                standard devices haven't adopted it). But it's not
                really clear here that it is

                recommending that TFC ought to be used in these cases
                with a limited set of

                messages, and that 32-bit alignment will often be a
                sufficient mechanism.

                Indeed the draft continues, saying it is preferred to
                poll, rather than send

                only when an event occurs. That is surely a form of TFC
                as well.

                
                If anything, the draft ought *not* to recommend against
                TFC, which is only on

                the spurious grounds than non-IoT applications have
                tended not to need it.

                
              Padding/dummy packet are likely to remain sufficient.
                If that is not the case, and that TFC is needed it is
                likely that we do not have a constrained device as we
                are trying hard to limit the number of frames sent for
                IoT devices. It is also really hard to configure TFC
                properly, so currently implementing it is likely to
                create a bias that will detect the application in
                question and as such loose a lot of the expected
                privacy. 

              
    [BB] Apart from the first sentence these are different points. They
    basically say TFC doesn't work. Unfortunately, the problem TFC was
    meant to solve still exists, whether TFC works or not. The draft
    should not translate "TFC doesn't work" into "A working TFC isn't
    needed".

    
               ===B.7) Sleep as
                well as reboot===

                
                §8 on Security Considerations discusses using session
                keys across reboots. I

                think it would appropriate to talk about across sleep
                periods as well.

              
              A SA is not affected by a sleep. It does not make any
                deference the devices sleep or not and more importantly
                IV are not repeated - by design. Reboot clear the SA
                context and might repeat the IV. Unless I am missing
                anything I do not see the concern with sleep.
            
          
    [BB] OK.

    
    Bob

    
                ==C) Editorial Nits==

                
                I found loads. I'll send an edited XML file if time
                permits. Otherwise, we'll

                have to trust that the RFC Editor will find them and fix
                them all.

                
                Regards

                
                Bob Briscoe

                
                _______________________________________________

                Lwip mailing list

                Lwip@xxxxxxxx

                https://www.ietf.org/mailman/listinfo/lwip

              
            -- 

            
                Daniel Migault

                
                Ericsson
              
            
      -- 

      
          Daniel Migault

          
          Ericsson
        
      
      _______________________________________________
Tsv-art mailing list
Tsv-art@xxxxxxxx
https://www.ietf.org/mailman/listinfo/tsv-art

    
    -- 
________________________________________________________________
Bob Briscoe                               http://bobbriscoe.net/
  

-- 
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call