Got it. We will make it explicit that the entries in the table are examples
only and by no means exhaustive.

Thanks!
Haoyu

-----Original Message-----
From: Scharf, Michael <Michael.Scharf@xxxxxxxxxxxxxxx>
Sent: Monday, November 1, 2021 4:21 PM
To: Haoyu Song <haoyu.song@xxxxxxxxxxxxx>; tsv-art@xxxxxxxx
Cc: draft-ietf-opsawg-ntf.all@xxxxxxxx; last-call@xxxxxxxx; opsawg@xxxxxxxx
Subject: RE: Tsvart last call review of draft-ietf-opsawg-ntf-09

> -----Original Message-----
> From: Haoyu Song <haoyu.song@xxxxxxxxxxxxx>
> Sent: Monday, November 1, 2021 9:22 PM
> To: Scharf, Michael <Michael.Scharf@xxxxxxxxxxxxxxx>; tsv-art@xxxxxxxx
> Cc: draft-ietf-opsawg-ntf.all@xxxxxxxx; last-call@xxxxxxxx; opsawg@xxxxxxxx
> Subject: RE: Tsvart last call review of draft-ietf-opsawg-ntf-09
>
> Hi Michael,
>
> Thank you very much for the review!
> According to your suggestion, we explicitly list congestion avoidance as a
> requirement at each plane and add RFC8085 as BCP reference.

If a reference for network circuit breakers is needed in addition to RFC 8085,
the other reference would be RFC 8084. I just mention this because I have used
the term "circuit breakers", but forgot to mention that there is a BCP for
that, too.

> We also take your suggestions on the precise terms used in the table.
> I have just one question: While IPFIX can run over TCP/UDP/SCTP, for the
> forwarding plane, we recommend using it over UDP only, for simplicity. Is
> this acceptable?

According to RFC 7011, SCTP "MUST be implemented by all compliant
implementations." As a result, such a recommendation might be inconsistent
with RFC 7011. I am not an IPFIX expert and I am not sure if RFC 7011 is
indeed fully aligned with running code. But if the document mentions IPFIX and
lists *only* UDP as the corresponding transport protocol, that would require
at least some explanation in the text, because clearly IPFIX could be
implemented over transport protocols other than UDP as well.

Note that there may be other simple solutions to work around that issue, e.g.,
by using in the table row a label such as "example data transport" or the
like, which would make it implicitly clear that options other than UDP could
exist. As long as there is some reasonable explanation of how to read the
table, I'll be fine.

Michael

> I'll upload a new version of the document as soon as the submission website
> is reopened. Thanks!
>
> Best regards,
> Haoyu
>
> -----Original Message-----
> From: Michael Scharf via Datatracker <noreply@xxxxxxxx>
> Sent: Sunday, October 31, 2021 4:24 PM
> To: tsv-art@xxxxxxxx
> Cc: draft-ietf-opsawg-ntf.all@xxxxxxxx; last-call@xxxxxxxx; opsawg@xxxxxxxx
> Subject: Tsvart last call review of draft-ietf-opsawg-ntf-09
>
> Reviewer: Michael Scharf
> Review result: Ready with Issues
>
> This document has been reviewed as part of the transport area review team's
> ongoing effort to review key IETF documents. These comments were written
> primarily for the transport area directors, but are copied to the document's
> authors and WG to allow them to address any issues raised and also to the
> IETF discussion list for information.
>
> When done at the time of IETF Last Call, the authors should consider this
> review as part of the last-call comments they receive. Please always CC
> tsv-art@xxxxxxxx if you reply to or forward this review.
>
> This informational document describes an architectural framework for network
> telemetry and the main components of corresponding systems.
>
> It has two issues related to TSV topics:
>
> First, the document lacks a discussion of the importance of congestion
> control for telemetry traffic as well as corresponding references, e.g., to
> RFC 8085. High-volume telemetry traffic can overload a network unless proper
> counter-measures are in place (i.e., at minimum "circuit breakers"). It
> doesn't seem appropriate to entirely ignore that issue.
>
> Second, language regarding the ambiguous term "transport" and the references
> to Internet transport protocols must be improved to be consistent with IETF
> standards.
>
> Below are some examples of sections in which these issues are obvious.
>
> Section 3.4
>
>    It is worth noting that a network telemetry system should not be
>    intrusive to normal network operations by avoiding the pitfall of the
>    "observer effect". That is, it should not change the network
>    behavior and affect the forwarding performance. Otherwise, the whole
>    purpose of network telemetry is compromised.
>
> => This statement should be extended to be very explicit about the risk of
> causing network congestion by high-volume telemetry traffic unless proper
> isolation or traffic engineering techniques are in place, or congestion
> control mechanisms ensure that telemetry traffic backs off if it exceeds the
> network capacity. RFC 8085 is a relevant BCP in this space. As a side note,
> RFC 8085 discusses other relevant challenges as well, but the issues caused
> by potentially inelastic high-volume telemetry traffic seem particularly
> relevant for ensuring network stability when telemetry solutions get
> deployed.
>
> 4.1. Top Level Modules
>
> +---------+--------------+--------------+---------------+-----------+
> | Module  | Management   | Control      | Forwarding    | External  |
> |         | Plane        | Plane        | Plane         | Data      |
> +---------+--------------+--------------+---------------+-----------+
> |Object   | config. &    | control      | flow & packet | terminal, |
> |         | operation    | protocol &   | QoS, traffic  | social &  |
> |         | state        | signaling,   | stat., buffer | environ-  |
> |         |              | RIB          | & queue stat.,| mental    |
> |         |              |              | ACL, FIB      |           |
> +---------+--------------+--------------+---------------+-----------+
> |Export   | main control | main control | fwding chip   | various   |
> |Location | CPU          | CPU,         | or linecard   |           |
> |         |              | linecard CPU | CPU; main     |           |
> |         |              | or forwarding| control CPU   |           |
> |         |              | chip         | unlikely      |           |
> +---------+--------------+--------------+---------------+-----------+
> |Data     | YANG, MIB,   | YANG,        | template,     | YANG,     |
> |Model    | syslog       | custom       | YANG,         | custom    |
> |         |              |              | custom        |           |
> +---------+--------------+--------------+---------------+-----------+
> |Data     | GPB, JSON,   | GPB, JSON,   | plain         | GPB, JSON |
> |Encoding | XML          | XML, plain   |               | XML, plain|
> +---------+--------------+--------------+---------------+-----------+
> |Protocol | gRPC,NETCONF,| gRPC,NETCONF,| IPFIX, mirror,| gRPC      |
> |         |              | IPFIX, mirror| gRPC, NETFLOW |           |
> +---------+--------------+--------------+---------------+-----------+
> |Transport| HTTP, TCP    | HTTP, TCP,   | UDP           | HTTP,TCP  |
> |         |              | UDP          |               | UDP       |
> +---------+--------------+--------------+---------------+-----------+
>
> => This table needs to be corrected.
>
> 1/ At least the entry in the column "forwarding plane" for IPFIX seems
> incorrect, as the IETF has standardized IPFIX use over TCP, UDP and also
> SCTP.
>
> HS>> Yes, IPFIX can run over TCP/UDP/SCTP, but for the forwarding plane, we
> recommend using it over UDP only, for simplicity. Is that okay?
>
> 2/ The label "transport" in the last line should be replaced by another term
> (maybe "data transport"?). In the TCP/IP protocol stack, "HTTP" is not a
> transport but an application protocol, unlike TCP and UDP. As a result, the
> row label should use a term that cannot be confused with the name of a layer
> in the TCP/IP protocol stack.
>
> 3/ The label "protocol" in the second-to-last line is also misleading. All
> entries in the "transport" line are protocols as well. The term "Application
> protocol" may be one option; others may exist as well.
>
> 4.1.1. Management Plane Telemetry
>
>    * High Speed Data Transport: In order to keep up with the velocity
>      of information, a server needs to be able to send large amounts of
>      data at high frequency. Compact encoding formats or data
>      compression schemes are needed to reduce the quantity of data and
>      improve the data transport efficiency. The subscription mode, by
>      replacing the query mode, reduces the interactions between clients
>      and servers and helps to improve the server's efficiency.
>
> => The server is not the only bottleneck. This section needs to discuss the
> network as a potential bottleneck as well, and explain that a telemetry
> solution must protect the network from congestion by congestion control
> mechanisms or at least circuit breakers. RFC 8085 is a relevant BCP in this
> space.
>
> 4.1.2. Control Plane Telemetry
>
> => Discussion of the risk of congestion by telemetry protocols without
> congestion control (e.g., using UDP possibly without circuit breakers) is
> missing in this section.
>
> 4.1.3. Forwarding Plane Telemetry
>
>    * The data plane devices must provide timely data with the minimum
>      possible delay. Long processing, transport, storage, and analysis
>      delay can impact the effectiveness of the control loop and even
>      render the data useless.
>
> => As in the previous section, this wording entirely ignores the impact of
> potential network capacity shortage and congestion. A reference to RFC 8085
> and a corresponding discussion of how to meet the requirements from RFC 8085
> are missing.
>
> 4.1.4. External Data Telemetry
>
> => As the communication with "external" entities outside the boundary of a
> provider network may be realized over the Internet, the risk of congestion as
> well as proper counter-measures is even more relevant in this section as
> compared to the previous sections.
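
Illustrative note on the review's central request: the review asks that
telemetry sent over UDP be protected by congestion control or, at minimum, a
circuit breaker. The sketch below shows one possible sender-side shape for
that idea: a token-bucket rate limit combined with a loss-triggered breaker.
It is only loosely inspired by the concepts in RFC 8084 and RFC 8085 and does
not implement either document. The class name, the parameters, the 10% loss
threshold, and the assumption that the collector reports received counts are
all hypothetical and come from neither the draft nor the thread.

```python
from dataclasses import dataclass


@dataclass
class TelemetryCircuitBreaker:
    """Hypothetical sender-side guard for UDP telemetry export (sketch only)."""

    rate_bytes_per_s: float      # assumed long-term export budget
    burst_bytes: float           # assumed maximum burst size
    loss_threshold: float = 0.1  # assumed trip point: more than 10% loss
    tokens: float = 0.0          # bytes currently available to send
    last_time: float = 0.0       # timestamp of the previous allow() call
    tripped: bool = False        # True once the breaker has opened

    def __post_init__(self) -> None:
        # Start with a full bucket so an initial burst is permitted.
        self.tokens = self.burst_bytes

    def allow(self, size: int, now: float) -> bool:
        """Return True if a record of `size` bytes may be sent at time `now`."""
        if self.tripped:
            return False
        # Refill the bucket in proportion to elapsed time, capped at the burst.
        elapsed = max(0.0, now - self.last_time)
        self.last_time = now
        self.tokens = min(self.burst_bytes,
                          self.tokens + elapsed * self.rate_bytes_per_s)
        if size <= self.tokens:
            self.tokens -= size
            return True
        return False

    def report_loss(self, sent: int, received: int) -> None:
        """Open the breaker when reported loss exceeds the threshold."""
        if sent > 0 and (sent - received) / sent > self.loss_threshold:
            self.tripped = True

    def reset(self) -> None:
        """Close the breaker again, e.g. after operator intervention."""
        self.tripped = False
        self.tokens = self.burst_bytes
```

An exporter built this way would call allow() before each send and drop or
aggregate records when it returns False. Once the breaker trips, export stops
entirely until reset() is called, which is the "stop rather than keep
congesting" reaction the review describes as the minimum safeguard.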