Hi David,
Thank you for taking time for the review..
Please find our response inline ##svshah and [Med]
Shitanshu
From: David Black <david.black@xxxxxxx>
Sent: Tuesday, February 21, 2017 3:31 PM To: tsv-art@xxxxxxxx Cc: idr@xxxxxxxx; draft-ietf-idr-sla-exchange.all@xxxxxxxx; ietf@xxxxxxxx Subject: Review of draft-ietf-idr-sla-exchange-10 Reviewer: David Black
Review result: Not Ready I've reviewed this document as part of the transport area directorate's ongoing effort to review key IETF documents. These comments were written primarily for the transport area directors, but are copied to the document's authors for their information and to allow them to address any issues raised. When done at the time of IETF Last Call, the authors should consider this review together with any other last-call comments they receive. Please always CC tsv-art@xxxxxxxx if you reply to or forward this review. Document: draft-ietf-idr-sla-10 Reviewer: David Black Review Date: February 21, 2017 Review result: Not Ready This is an early TSV-ART review of a working group draft, requested by the IDR Working Group. This draft defines an extension to BGP to allow exchange of traffic handling parameters (e.g., configured rates, burst sizes, drop thresholds). While this is a useful area of technology to standardize across network operators, this draft has significant problems, and parts of it could use some serious rethought. This reviewer has discussed small portions of this draft with some of the authors in the past, but this is his first comprehensive reading and review of the draft. Major Issues: [1] The draft is misnamed. This is not an SLA (Service Level Agreement) draft - it's a TCA (Traffic Conditioning Agreement) draft - see the definition of TCA in RFC 2475. This draft should start from that definition and generalize the applicability of TCA beyond Diffserv. Large areas of SLA content are not covered by this draft - for more details, see the Wikipedia article on SLA: https://en.wikipedia.org/wiki/Service-level_agreement . ##svshah, sure, we can change the term to TCA (Traffic Conditioning Agreement), which fairly represents intention in the draft, if there is no otherwise comment from the working group. [2] Section 3.3.2.1's reuse of the TSpec construct from RFC 2115 to specify a token bucket is a good idea, but it's not a good idea to respecify that construct in terms of L2 (link-layer, e.g., Ethernet) octets, as the RFC 2115 TSpec is specified in terms of IP octets. This change to specification in terms of L2 octets results in needing the L2_OVERHEAD TLV to cope with the possible differences in L2 (link) framing overhead at sender and receiver of this information. The TSpec should be respecified at the IP layer, with L2 framing overhead left to the Producer and Consumer to factor into their calculations based on each's direct knowledge of L2 functionality and configuration in the local AS. That ought to enable elimination of the L2_OVERHEAD TLV, thereby reducing complexity.
##svshah, the purpose for advertising L2 overhead, from the Producer to the Consumer, is to make sure traffic is well conditioned, at the egress of the Consumer, to avoid possible indiscriminate drops at the ingress of the Producer. In another words, the Producer
is telling the Consumer to ignore its own link level overhead, instead use Producer's provided link level overhead while running frames through QoS functions (this is relevant in use-cases where l2 overhead of the egress link on Consumer and of ingress link
on the Producer are of different size, e.g., in the vpn/tunnel connection between two peer nodes which physically may be multiple hops away).
I understand your comments regarding re-using TSpec without any modification. Do you agree with the use-case of l2 differences? If yes, any suggestion how to accommodate that?
[3] A token bucket should have two rate-related marking parameters based on its token fill rate, i.e., min-rate, not the four rate-related marking parameters in this draft. The max-rate parameters in sections 3.3.2.5-6 ought to be specified against a second token bucket. In addition, the handling precedence algorithm in section 3.3.2.7 is an overly complex way to specify the relationship of two token buckets. All of this is even more important, because max-rate, as defined in RFC 2115, is only applicable to bursting - that max-rate for bursting often turns out to be an interface line rate, which is not generally useful for the traffic provisioning purposes of this draft.
##svshah, what you describe below conceptually very much makes sense and that is what we attempt to achieve. What is unclear though how to capture that using TSpec definition specified in RFC2215. Since that TSpec definition has both minimum-rate and maximum-rate,
but no in/out profile marking parameters.
Is your proposal below to define two TSpecs using the same definition from RFC2215? Wouldn't in that case we have min/max specified twice? min/max in each TSpec? and thus confusing to represent one min and one max through two TSpecs?
The following should be done instead: - Define TSpecs for two token buckets, a primary/committed token bucket and a secondary/peak token bucket that MUST be nested, i.e., traffic that is in-profile for the secondary/peak token bucket is always in-profile for the primary/committed token bucket. Some of the details of how to specify this are subtle, see RFC 2698 for a worked example. Use of a secondary/peak token bucket requires use of the primary/ committed token bucket, but a primary/committed token bucket can be used without a secondary/peak token bucket. - For a single token bucket, define two handling TLVs, Committed (in profile) and Excess (out of profile). - For two token buckets, define three handling TLVs, Committed (in profile for both token buckets), Peak (out of profile for primary/committed token bucket, but in profile for secondary/peak token bucket) and Excess (out of profile for both token buckets). NB: Could use Green/Yellow/Red terms instead of Committed/Peak/Excess terms. [4] The drop threshold TLV in section 3.3.2.8 is not specified sufficiently to be implemented interoperably. For example, I don't understand what an implementation is supposed to do when it receives 3 drop thresholds.
##svshah, It is around the semantics of a single queue with a different threshold for a set of code-points, where packet for a specific code-point is to be tail dropped if overall queue-depth hits code-point specific threshold at the arrival of that packet.
We will add appropriate clarification for this semantics.
[5] The relative priority TLV in section 3.3.2.9 has the same insufficient specification problem as the drop threshold TLV, compounded by a functional incompleteness problem - if the recipient is using a weighted packet transmission scheduler (e.g., WRR), priorities cannot be used to configure that scheduler. Hence, some specification of weights and scheduling algorithms that use weights needs to be added.
##svshah, Sure. Is following clarification okay? (some of the wordings taken from RFC2598)
"
A higher priority class of traffic to be served without pre-empted by lower priority class of traffic for more than a packet time at the configured rate.
In the system that implements WRR, the use of relative priority may get restricted where a single queue may be used for a higher priority traffic class where that queue is configured for the full share of the output bandwidth.
"
[6] I have no idea what the sub-traffic classes TLV in section 3.3.2.10 is supposed to do, as that TLV is specified based on "Traffic Class TLVs" which is an undefined term in this draft (e.g., that term is not used outside of section 3.3.2.10.
##svshah, They are to facilitate hierarchy. A specific Traffic Class, can further be divided in a multiple subset of Traffic Classes, with their own Traffic Class Elements and Traffic Class Services.
Will correct the wording to remove your this specific concern.
[7] This draft's QoS contents need to be functionally aligned with work-in-progress on YANG QoS models, in order to provide some assurance that that this draft is implementable for actual network switch/router data paths. The current acknowledgement of the existence of YANG, NETCONF and RESTConf at the end of Section 1 does not suffice.
##svshah, While eventually "Traffic Conditioning Agreement" should be translated to the actual forwarding qos policy on any vendor specific device, TCA exchange largely carries concepts/semantics that is either standard based or well understood.
While we are not aware of any working group document of QoS Yang Model, we think that concepts of Traffic Conditioning should be easily adaptable to any QoS Yang Models since they also have to be defined to support those concepts.
[8] There are significant complexity and correctness problems caused by the option to not specify the Source AS - e.g., Section 3.2 defines an SLA ID as an "identifier which is unique in the scope of Source AS" which is meaningless if there is no Source AS. It would be simpler to always specify Source AS, even in the point-to-point case. ##svshah, okay. Ron Bonica also suggested same in his review earlier. We had chosen a path of optional Source AS to simplify implementations in cases. Given specification of Source AS always is more clearer in multiple feed-back we have gotten so far, we can modify the specification to incorporate that over the simplicity of implementation for certain cases. [9] In section 3.2, the "intended for the peer receiver of the BGP UPDATE message" text in the specification of bit 0 of the SLA Subtype flags is unclear. I suspect that this is intended to differentiate the two usages described in sections 4.1.1 (Point-to-Point) and 4.1.2 (Multiple Hops), in which case the parenthesized terms (or similar terminology) should be used with cross-references to those two sections.
##svshah, sure will make necessary changes, also in the context of earlier comment.
[10] There needs to be a coherent discussion in one place about how SLA advertisement, update and withdrawal work. A single ADVERTISE method may suffice on the wire, but the details on how initial advertisement, subsequent advertisement (update) and withdrawal work need to be specified in one place. The third paragraph of Section 4 is a start on this material, but it's too terse; [Med] That text is terse because we are reusing the BGP machinery for advertising/withdrawing; we only augment it with
triggers that are linked to the SLA information. A new SLA will lead to an update message with ADVERTISE, an update of an existing SLA will trigger an update message. We do think this information is already present in the document.
it should be expanded
into its own subsection, and moved earlier to come before the ADVERTISE method in Section 3.2. This text from Section 3.2 should be moved into that new subsection and likewise expanded: [Med] Another option is to move it Section 4. From our standpoint, we do think it is straightforward to describe the behavior once the attributes format are defined. one, for the same prefix and from the same Source AS, indicates Source AS is advertising new SLA Content to replace the previous one advertised with the same SLA ID. In addition, I wonder whether functionality should be added to allow withdrawal of an advertisement by specifying its SLA ID, although that was not part of the original design. [Med] The reason we didn’t adopted that design is that we don’t want to make assumptions on which data the remote peer will be used for enforcing local actions and also because of this text: The SLA ID applies to aggregate traffic to prefixes for a given AFI/SAFI that share the same Source AS and SLA ID. [11] Notions of context for interpretation of all the IPFIX parameters in 3.3.1 need to be added, e.g.: - The first three parameters (DSCP, MPLS EXP field in top label, 802.1q priority) can and do vary on a link-by-link or LSP-by-LSP basis along a traffic's network path. - The IP address parameters are rather likely to be VPN-specific when there's more than one BGP/MPLS VPN that spans or transits the ASs involved. - The transport port parameters need specification of which transport header and where it is located (e.g., for TCP traffic carried by in VXLAN, is this the inner TCP header or the outer UDP header in VXLAN). There are probably simple approaches to specifying context in all cases, but that context does need to be specified ... in all cases. [Med] We dind’t include a context field because we thought that the definition of the IPFIX attributes is sufficient by itself, but if you do think it is helpful to have such information, we can update the table. [12] The security considerations (section 10) are severely incomplete and insufficient: - Discussion of possible abuse of this BGP option for denial-of-service and theft-of-service, needs to be added, including possible countermeasures and mitigations. [Med] This attack vector is not specific to this attribute; this is valid for BGP in general. - This sentence at the end of the second paragraph in Section 10 is content-free: It is NOT RECOMMENDED to enable this attribute at the scale of the Internet unless if means to prevent leaking sensitive information are enforced. What exactly is an implementer or admin supposed to do? [Med] An example of what an admin needs to check if this information can be sent to a given peer or not. Some of the
content of the attribute contains some sensitive information such as contract Id, classes, etc.
How does
one figure out whether an implementation or deployment is "at the scale of the Internet" ?? [Med] The idea here is to avoid leaking such data in to the global routing table. Only entitled peers need receive
such information with controlled scope.
- The next to last paragraph in section 10 is almost content-free, as
it leaves decisions on implementation and deployment of key security functionality as "an exercise for the reader" - that's not acceptable. [Med] I guess you are referring to the following text. Validation checks are really deployment-specific. The text calls out the issue and recommends an action; how to translate this action into detailed actions is realty local to a domain The attribute may be advertised by a misbehaving node to communicate
SLA parameters that are not aligned with the SLA agreements. Though
the enforcement of SLA parameters is outside the scope of this
document, it is RECOMMENDED that the SLA Consumer to enforce a set of
validation checks before translating the SLA parameters conveyed in
the QoS attributes into provisioning actions. Such validations MAY
rely on SLA parameters like the origin AS or SLA ID, like generating
SLA ID using pseudo-random schemes [RFC4086]. - Last, but not least, the final paragraph in Section 10 is a joke that will be lost on a security directorate reviewer - to understand why, see Section 2 of RFC 6919, and take note of the publication date of RFC 6919. ------------------------ Having noted a dozen major issues, I'll end the review here for now, as I believe a serious revision of the draft is called for, which would be a better starting point to review for minor issues and editorial items. -------------------------------------------------------- David L. Black, Distinguished Engineer Dell EMC, 176 South St., Hopkinton, MA 01748 +1 (508) 293-7953 Cell: +1 (978) 394-7754 David.Black@xxxxxxxx <=== NEW === -------------------------------------------------------- |