Re: [Last-Call] Tsvart telechat review of draft-ietf-sfc-oam-framework-13

Hi Frank,

Thank you for the review. Please see the responses inline.


    Reviewer: Frank Brockners
    Review result: Ready with Nits
    
    This document has been reviewed as part of the transport area review team's
    ongoing effort to review key IETF documents. These comments were written
    primarily for the transport area directors, but are copied to the document's
    authors and WG to allow them to address any issues raised and also to the IETF
    discussion list for information.
    
    When done at the time of IETF Last Call, the authors should consider this
    review as part of the last-call comments they receive. Please always CC
    tsv-art@xxxxxxxx if you reply to or forward this review.
    
    This document provides a reference framework for OAM for SFC.
    
    Comments:
    
    Section 3.1.1 SF availability: The text makes explicit reference to multiple
    instances of a SF. Consequently, it should be defined how availability of a SF
    is computed/determined in case multiple instances are deployed. 

<Nagendra> This is already clarified in the section as below:

"For cases where
   multiple instances of an SF are used to realize a given SF for the
   purpose of load sharing, SF availability can be performed by checking
   the availability of any one of those instances, or the availability
   check may be targeted at a specific instance."

This further
    leads to the question, whether availability is always a "binary" state
    (available / not-available), or could a SF be e.g. 99% available? 

<Nagendra> Availability is measured as a binary state. I am not sure what "99% available" means. If it means getting 99 responses for 100 probes sent, I think that falls under the packet loss category, which in turn is a performance measurement.
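
As an illustration only (not text for the draft), the distinction between binary availability and a loss ratio could be sketched as below. The function names are hypothetical, and treating "at least one probe answered" as available is an assumption made for this sketch, not something the draft specifies.

```python
from typing import Sequence


def is_available(probe_replies: Sequence[bool]) -> bool:
    """Binary availability (assumed rule): available if any probe was answered."""
    return any(probe_replies)


def loss_ratio(probe_replies: Sequence[bool]) -> float:
    """Fraction of probes that went unanswered; a performance metric."""
    if not probe_replies:
        raise ValueError("at least one probe is required")
    lost = sum(1 for replied in probe_replies if not replied)
    return lost / len(probe_replies)


# 99 replies out of 100 probes: the SF is available, and the 1% shows up as loss.
replies = [True] * 99 + [False]
print(is_available(replies))  # True
print(loss_ratio(replies))    # 0.01
```

With this split, "99% available" would be reported as available with 1% packet loss.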

Section 3.1.2
    SF performance: What is the impact of a "multiple instance SF deployment" on SF
    performance measurement? 

<Nagendra>I think we covered this in SF availability but not here. Does the below updated text look better?

OLD:
On the one hand, the performance of any specific SF can be quantified
   by measuring the loss and delay metrics of the traffic from SFF to
   the respective SF, while on the other hand, the performance can be
   measured by leveraging the loss and delay metrics from the respective
   SFs.  The latter requires SF involvement to perform the measurement
   while the former does not.

NEW:
On the one hand, the performance of any specific SF can be quantified
   by measuring the loss and delay metrics of the traffic from SFF to
   the respective SF, while on the other hand, the performance can be
   measured by leveraging the loss and delay metrics from the respective
   SFs.  The latter requires SF involvement to perform the measurement
   while the former does not. For cases where
   multiple instances of an SF are used to realize a given SF for the
   purpose of load sharing, SF performance can be quantified by measuring 
   the metrics for any one instance of SF or by measuring the metrics for 
   a specific instance.

The section only talks about loss and delay as
    performance criteria. It would be good to state that other performance criteria
    (e.g. specific to the SF, throughput, etc.) exist. 

<Nagendra> We can add the below to Section 3.1.2:

NEW:
"The metrics measured to quantify the performance of the SF component are 
not limited to loss and delay. Other metrics such as throughput also exist, 
and the choice of metrics for performance measurement is outside the scope 
of this document."

Section 3.2.1 SFC
    availability: The current definition is very focused on connectivity
    verification, i.e. it tries to answer the question: "Does my SFC transport
    packets?". IMHO we should also ask the question "Does my SFC process the
    packets correctly?" - because if packets are not processed per the SFC
    definition, we might not call the SFC available. 

<Nagendra> I think this is already handled by SF availability. The end-to-end SFC availability is verified by steering the OAM packet over the ordered set of SFs within the SFC. This is more like daisy chaining the availability of SFs within the SFC to determine end-to-end SFC availability. If the derived solution verifies the SF availability not just based on the uptime but based on the service treatment, it also answers the question "Does my SFC process the packets correctly". Let us know if there is any further clarity required. 
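
A rough sketch of this "daisy chaining" idea, for illustration only: the SFC is treated as available when every SF in the ordered set is available, and an SF with multiple instances is treated as available when any one instance responds (following the Section 3.1.1 text quoted earlier). The data layout and function names are hypothetical.

```python
from typing import Dict, List


def sf_available(instances: Dict[str, bool]) -> bool:
    """An SF is available if any one of its instances is available."""
    return any(instances.values())


def sfc_available(chain: List[Dict[str, bool]]) -> bool:
    """End-to-end SFC availability: every SF in the ordered chain is available."""
    # An empty chain is treated as not available (an assumption of this sketch).
    return bool(chain) and all(sf_available(sf) for sf in chain)


chain = [
    {"fw-1": True, "fw-2": False},  # firewall SF, two load-shared instances
    {"nat-1": True},                # NAT SF, single instance
]
print(sfc_available(chain))  # True

chain[1]["nat-1"] = False
print(sfc_available(chain))  # False
```

Whether "available" means only reachable or also "applies the expected service treatment" depends on how each per-SF check is implemented, which this sketch leaves open.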

While 3.2.2 states that "any
    SFC-aware network device should have the ability to make performance
    measurements" a similar statement isn't found in 3.2.1. IMHO the ability for
    availability checks is probably a prerequisite for performance measurement.

<Nagendra> The ability to perform end-to-end or partial SFC availability verification is already mentioned in section 3.2.1 as below:

" In order to perform service connectivity verification of an SFC/SFP,
   the OAM functions could be initiated from any SFC-aware network
   devices of an SFC-enabled domain for end-to-end paths, or partial
   paths terminating on a specific SF, within the SFC/SFP"

Please let us know if you have any suggestion to improve if there is a lack of clarity.

    Section 3.2.2 SFC performance measurement: The section only mentions the need
    for performance measurement. It misses the definition of what SFC performance
    measurement is. 

<Nagendra>

Section 3.3. Classifier component: The section mentions the
    need for the ability to perform performance measurement of the classifier
    component. What is performance measurement of the classifier? What does
    performance measurement of the classifier component comprise? 

<Nagendra>We can add the below text:

OLD:
Any SFC-aware network device should have the ability to perform
   performance measurement of the classifier component for each SFC.

NEW:
Any SFC-aware network device should have the ability to perform
   performance measurement of the classifier component for each SFC.
   The performance can be quantified by measuring the performance metrics of the
   traffic from the classifier for each SFC/SFP.

Section 3.4. /
    3.5. Availability/PM of the underlay and overlay network: It would be good to
    add a sentence that states that the mechanisms for availability/PM which are
    offered by the technologies used by the overlay/underlay are used, rather than
    new methods specifically for SFC would be defined. 

<Nagendra>Yes, that makes sense. Please check the below text:

OLD:
Any SFC-aware network device may have the ability to perform
   availability check or performance measurement of the overlay network.

NEW:
Any SFC-aware network device may have the ability to perform
   availability check or performance measurement of the overlay network. Any
   existing OAM tools and techniques can be leveraged for this purpose.

Section 4. SFC OAM
    Functions: It would be good, if examples in section 4 could also include more
    "recent" methods such as OWAMP/TWAMP (RFC4656, RFC 5357). 

<Nagendra> 

OLD:
Delay within an SFC could be measured based on the time it takes for
   a packet to traverse the SFC from the ingress SFC node to the egress
   SFF.  As SFCs are unidirectional in nature, measurement of one-way
   delay [RFC7679] is important.  In order to measure one-way delay,
   time synchronization MUST be supported by means such as NTP, PTP,
   GPS, etc.

NEW:
Delay within an SFC could be measured based on the time it takes for
   a packet to traverse the SFC from the ingress SFC node to the egress
   SFF.  Measurement protocols such as the One-way Active Measurement
   Protocol (OWAMP) [RFC4656] and the Two-way Active Measurement Protocol
   (TWAMP) [RFC5357] can be used to measure the characteristics. As
   SFCs are unidirectional in nature, measurement of one-way
   delay [RFC7679] is important.  In order to measure one-way delay,
   time synchronization MUST be supported by means such as NTP, Precision Time Protocol (PTP),
   GPS, etc.
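
To illustrate why time synchronization matters for the one-way delay mentioned in the text above, here is a minimal sketch. It is not an OWAMP implementation; the function name and the microsecond timestamps are hypothetical. The delay is the difference between the egress receive time and the ingress transmit time, so any clock offset between the two nodes feeds directly into the result.

```python
from typing import Tuple


def one_way_delay_us(tx_us: int, rx_us: int,
                     max_clock_offset_us: int = 0) -> Tuple[int, int]:
    """Return (low, high) bounds on one-way delay in microseconds.

    tx_us is stamped by the ingress node's clock and rx_us by the egress
    node's clock. max_clock_offset_us is the assumed worst-case offset
    between the two clocks (e.g. what NTP or PTP is expected to guarantee).
    """
    measured = rx_us - tx_us
    low = max(0, measured - max_clock_offset_us)
    high = measured + max_clock_offset_us
    return (low, high)


# Measured 250 us with clocks synchronized to within 50 us.
print(one_way_delay_us(1_000_000, 1_000_250, max_clock_offset_us=50))  # (200, 300)
```

A two-way (round-trip) measurement avoids this dependency because both timestamps come from the same clock, but it cannot separate the two directions of a unidirectional SFC.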

Section 4.4.
    Performance Measurement: Focus is entirely on the PM of the connectivity,
    rather than on the SF. How about covering PM for the SF as well? 

<Nagendra> I am not sure I understand what is missing. Do you have any suggestions for improving the text?

Section 5.1
    OAM Tool Gap Analysis:
     - Not sure what "NVo3 OAM" refers to. Could that be explained below the table
     and in section 1.2.1? 

<Nagendra> Combining this with the other queries below, as they appear to be related.

- E-OAM needs to be detailed. It seems that CFM
     (802.1ag) and not 802.3ah is referred to here. 

<Nagendra> Per my understanding, 802.3ah is 1-hop while 802.1ag can span more than 1 hop, and both use Ethernet frames. So I think both are applicable here. My response regarding E-OAM details in this section is combined below.

- "Trace" in the "Trace" column
     needs to be expanded on. Is this traceroute? Paris-Traceroute? IOAM-Loopback? 

     IPPM needs to be detailed, because IPPM is not a tool as such but an IETF WG.
     Does this refer to OWAMP/TWAMP/etc. as defined by IPPM?
    
<Nagendra> Combining the above queries. 

OLD:
There are various OAM tool sets available to perform OAM functions
   within various layers.  These OAM functions may be used to validate
   some of the underlay and overlay networks.  Tools like ping and trace
   are in existence to perform connectivity check and tracing of
   intermediate hops in a network.  These tools support different
   network types like IP, MPLS, TRILL, etc.  There is also an effort to
   extend the tool set to provide connectivity and continuity checks
   within overlay networks.  BFD is another tool which helps in
   detecting data forwarding failures.  Table 3 below is not exhaustive

NEW:
There are various OAM tool sets available to perform OAM functions
   within various layers.  These OAM functions may be used to validate
   some of the underlay and overlay networks.  Tools like ping and trace
   are used to perform connectivity check and tracing of
   intermediate hops in a network.  These tools are already available for
   different types of networks such as IP, MPLS, TRILL, etc. 
 
E-OAM offers OAM mechanisms such as an Ethernet continuity check for 
Ethernet links. There is an effort around NVO3 OAM
to provide connectivity and continuity checks for networks that use NVO3.  BFD is used
for the detection of data plane forwarding failures.
 
The IPPM framework [RFC2330] offers tools such as OWAMP [RFC4656] and TWAMP
[RFC5357] (collectively referred to as IPPM in this section) to measure various performance
metrics. MPLS Packet Loss Measurement (LM) and Packet Delay Measurement (DM) (collectively
referred to as MPLS_PM in this section) [RFC6374] offer the ability to measure
performance metrics in MPLS networks.

Table 3 below is not exhaustive.

Section 6.4.3 IOAM:
    - The section states that IOAM "may be used to perform various SFC OAM
    functions as well". It would be good to expand on this statement: E.g. IOAM
    Trace-Option Type could be leveraged for SFC tracing. IOAM Direct-Export Option
    Type could be leveraged. - How would we deal with the IOAM Active Flag
    (draft-ietf-ippm-ioam-flags-01) when used with SFC OAM? 

<Nagendra> The intention of the section is to highlight the applicability of different OAM toolsets for OAM functions at the service layer. I am not sure we should try to explain all the possible options within each tool. But I agree that it is worth clarifying the availability of IOAM options for tracing. I think we can clarify that different IOAM Option-Types are available for OAM functions such as SFC tracing. Can you check if the below looks ok?

OLD:
[I-D.ietf-sfc-ioam-nsh] defines how In-Situ OAM data fields are
   transported using NSH header.  [I-D.ietf-sfc-proof-of-transit]
   defines a mechanism to perform proof of transit to securely verify if
   a packet traversed the relevant SFP or SFC.  While the mechanism is
   defined inband (i.e., it will be included in data packets), it may be
   used to perform various SFC OAM functions as well.

NEW:
[I-D.ietf-sfc-ioam-nsh] defines how In-Situ OAM data fields are
   transported using NSH header.  [I-D.ietf-sfc-proof-of-transit]
   defines a mechanism to perform proof of transit to securely verify if
   a packet traversed the relevant SFP or SFC.  While the mechanism is
   defined inband (i.e., it will be included in data packets), IOAM Option-Types
   such as IOAM Trace Option-Types can also be used to perform other SFC OAM functions
   such as SFC tracing.

- The text states
    "In-Situ OAM could be used with O bit set": Why would IOAM be used with the
    overflow bit set for SFC OAM? For details on IOAM's O-bit, see section 4.4.1 in
    https://tools.ietf.org/html/draft-ietf-ippm-ioam-data-09. 

<Nagendra> The O bit referred here is not the O bit in IOAM but the one in NSH/Overlay header. To avoid any confusion, this can be updated as below:

OLD:
In-Situ OAM could be used with O bit set to perform SF availability
   and SFC availability or performance measurement.

NEW:
In-Situ OAM could be used with O bit in the overlay header set, to perform SF availability
   and SFC availability or performance measurement.

Section 6.4.4 SFC
    Traceroute: - This section refers to an expired draft (even calling out the
    fact that the draft has expired), but also mentions that functionality is
    available and implemented in OpenDaylight. Consider removing the references to
    the expired draft and rather add references to OpenDaylight documents. - IOAM
    Loopback (see draft-ietf-ippm-ioam-flags-01) could apply to SFC Traceroute as well.
    
<Nagendra>Ok. Let me check if I can find some reference for ODL. 

    Detailed set of nits that I encountered while reading through the document ([x]
    references line number x) – hope that they are helpful in further improving the
    doc:

<Nagendra> Yes, of course.
    
    [global] s/an SF/a SF/ -- and similarly SFC/SFF

<Nagendra> Other RFCs use "an SF/SFF", so the draft is updated accordingly. If your suggestion is to replace "a SF" with "an SF", it is done.

    [176] "OAM Controller" not defined

<Nagendra>We can change it as below:

OLD:
OAM controllers are assumed to be within the same administrative
   domain as the target SFC enabled domain.

NEW:
OAM controllers are SFC-aware network devices that are capable of 
generating OAM packets. They are assumed to be within the same 
administrative domain as the target SFC enabled domain.

    [202] Why just Virtual Machines and no containers? Suggest to make things
    generic and talk about virtual and physical entities.

<Nagendra> We changed this as virtual entities.

          This comment applies throughout the document.
    [216] Ethernet OAM: Add reference. Do you refer to physical layer Ethernet OAM
    (802.3ah) or CFM (802.1ag)? 

<Nagendra> The response was provided in the above comment section.

[243] s/uses the overlay network/uses the overlay
    network layer/ 

<Nagendra> Done.

[246] Could we add a few examples of "various overlay network
    technologies"? For the underlay network layer several examples are listed.

<Nagendra> Ok.

    [248] What does "mostly transparent" mean? 

<Nagendra> The data plane elements connecting the overlay layer nodes may not always process the overlay header. 

[254] What does "tight coupling"
    between the link layer and the physical technology mean? 

<Nagendra> I am not sure I understand the nit here. Do you see any difficulty in parsing the sentence?

[255] Suggest to avoid
    terms like "popular" - popularity can change, standards stay 

<Nagendra> Ok. This is changed as "Ethernet is one such choice..."

[256] Acronyms
    "POS" and "DWDM" are not defined 

<Nagendra> Added.

[274] Link start/end-points don't seem to
    always align with the underlay network in the diagram 

<Nagendra> Fixed it.

[287] s/may comprise
    of/may consist of/ 

<Nagendra> We fixed it as "may comprise".

[288] s/but not shown/but is not shown/ 

<Nagendra> We fixed this as "intermediate nodes not shown..."

[307]
    s/devices/device/ 

<Nagendra> Done.

[308] What is a "controller"? 

<Nagendra> We discussed this in the above comment section.

[314] s/includes/include/ 

<Nagendra>Done.

[319]
    Add hSFC to list of acronyms in section 1.2.1 

<Nagendra> This is expanded in the respective section. We added it in the acronym section as well.

[320] Add IBN to list of acronyms
    in section 1.2.1 

<Nagendra> Ok, Done.

[325] s/includes/include/ 

<Nagendra> Done.

[359] The function/term "controller"
    requires definition. 

<Nagendra> Done, as mentioned in the above comment section.

[383] s/?./?/ 

[398] s/get the got/got/

<Nagendra> Done.

 [461]
    s/devices/device/

<Nagendra> Done.

 [469] Does it have to be equal cost multipath at the service
    layer, or could unequal cost multipath also be an option for load-balancing?
   
<Nagendra>I didn’t see any discussion specific to ECMP/UCMP in the architecture RFC.

 [521] Not sure whether the overlay network establishes the service plane. Isn't
    it that the overlay network establishes connectivity for the SFC-related
    functions in the service plane? 

<Nagendra> The service layer is established over the overlay network layer. I am not sure it is right to say that the overlay network provides connectivity for the service layer.

[531] s/components/component/ [545] remove
    "underlay" 

<Nagendra> Done.

[595] s/devices/device/ 

<Nagendra> Done.

[600] s/action/an action/ 

<Nagendra> Done.

[601] Expand on
    "TTL or other means" (TTL also needs to be added to acronyms in 1.2.1). Is this
    specific to NSH? Or specific to IPv4?

<Nagendra> TTL is listed as well-known abbrev in https://www.rfc-editor.org/materials/abbrev.expansion.txt and so we left it as it is. TTL in this document refers to NSH TTL field.

 [630] Mention that for "approximation of
    packet loss for a given SFC can be derived" to be applicable, SFC OAM packets
    would need to be forwarded the same as live user traffic.

<Nagendra> As it is intending to derive the approximate loss value, I am not sure if we need this additional consideration that the OAM packet would need to follow the live user traffic. Let me know if you think otherwise.

 [636] Is uppercase
    "MUST" applicable to an informational document? Especially given that
    RFC2119/RFC8174 is explicitly referenced by the draft. 

<Nagendra> Based on various reviewer comments, we removed the use of any normative statement.

[666] Add MPLS, TRILL to
    acronyms in 1.2.1 

<Nagendra> Ok. Done.

[678] s/exhaustive/exhaustive./ 

<Nagendra> Done.

[720] Is uppercase "SHOULD" applicable to an informational document?
    Especially given that RFC2119/RFC8174 is explicitly referenced by the draft.
    
<Nagendra> Based on various reviewer comments, we removed the use of any normative statement.

[722] Is uppercase "MAY" applicable to an informational document? Especially
    given that RFC2119/RFC8174 is explicitly referenced by the draft. 

<Nagendra> Based on various reviewer comments, we removed the use of any normative statement.

[754]
    s/packet/packets/ 

[755] s/to next node/to the next node/

 [771] How does this
    requirement align with the earlier paragraph, e.g. in case a node sends an ICMP
    reply? It would probably make sense to scope the statement to e.g. NSH. 

<Nagendra> As mentioned in the statement, the node that initiates the OAM packet must set the marker and so this statement is applicable for the initiating node.

[806]
    s/function/functions/ 

<Nagendra> Done

[809] s/from relevant node/from the relevant node/ 

<Nagendra> Done

[810]
    s/generate ICMP/generate an ICMP/ 

<Nagendra> Done

[812] s/from last/from the last/ 

<Nagendra> Done

[830]
    s/perform continuity/perform the continuity/

<Nagendra> Done

 [834] s/with relevant/with the
    relevant/ 

<Nagendra> Done

[835] s/perform partial SFC availability./perform a partial SFC
    availability check./ 

<Nagendra> Done

[851] For "In-Situ OAM data fields" add a normative
    reference to draft-ietf-ippm-ioam-data 

[905] Add "CLI" to section 1.2.1
    acronyms 

<Nagendra> Done

[920] Add a reference for NETCONF ->RFC6241
    
<Nagendra> Done

Once again, thanks a lot for the great comments.

Regards,
Nagendra
    
    
    

-- 
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call



