There's a ticket in the system for this, kindly inserted on my behalf: https://trac.tools.ietf.org/tools/ietfdb/ticket/2078 Regards Brian On 07/12/2016 07:18, Donald Eastlake wrote: > So, are we heading for a situation where all Directorate review will > come through in a way that you can't tell what directorate it is or > who did the review without opening the mail? And that just doing a > reply-all will uselessly send the response to the Secretariat? > > Seems to me that generally in the past such messages were from the > reviewer and the Directorate name appeared in the subject and > reply-all usually did the right thing... > > Thanks, > Donald > =============================== > Donald E. Eastlake 3rd +1-508-333-2270 (cell) > 155 Beaver Street, Milford, MA 01757 USA > d3e3e3@xxxxxxxxx > > > > ---------- Forwarded message ---------- > From: IETF Secretariat <ietf-secretariat-reply@xxxxxxxx> > Date: Tue, Dec 6, 2016 at 11:27 AM > Subject: Review of draft-ietf-pals-endpoint-fast-protection-04 > To: "General Area Review Team (Gen-ART)" <gen-art@xxxxxxxx> > Cc: draft-ietf-pals-endpoint-fast-protection.all@xxxxxxxx, > ietf@xxxxxxxx, pals@xxxxxxxx > > > Reviewer: Dale Worley > Review result: Ready with Nits > > I am the assigned Gen-ART reviewer for this draft. The General Area > Review Team (Gen-ART) reviews all IETF documents being processed > by the IESG for the IETF Chair. Please treat these comments just > like any other last call comments. > > For more information, please see the FAQ at > <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>. > > Document: draft-ietf-pals-endpoint-fast-protection-04 > Reviewer: Dale R. Worley > Review Date: 2016-12-06 > IETF LC End Date: 2016-12-06 > IESG Telechat date: (not set) > > Before reviewing this document, I knew nothing about pseudowire > routing, so my review does not fully assess the technical aspects of > the document. I suspect that almost all of these items are editorial. > I hope that they are simply places where the description can be made > clearer to the naive reader by making it more exact. Some of them may > be places where I don't fully understand the technology. It's > possible that some of them are technical issues. > > Summary: > > This draft is basically ready for publication, but has nits that > should be fixed before publication. > > These items seem to have technical content: > > In section 4.2 and 4.3, it is not clear to me whether all of the PWs > being carried by a particular tunnel must have the same protector. > Section 4.2 says that it is so, but section 4.3 suggests that the PWs > can be divided into subsets which have different protectors, without > mentioning any correspondence between the subsets and the > tunnel-groups of PWs. > > What is the mechanics of context identifiers in the context of > protecting egress ACs? It seems like this should all be analogous to > the described cases (protecting PEs), but as this is a specification, > how the analogy works should be made explicit. > > In section 6.4, does there need to be a specification for the > "Reserved" fields of the Protection FEC Element TLV? Usually, there > is a specification "must be sent with zeros/must be ignored on > receipt". (Or is there a global understanding of this in LDP?) > > In section 6.4, there is a definition of "PW Information". Are there > previous definitions of "PW information" in PW/LDP/MPLS usage? It > looks a lot like what's defined in RFC 4447 sections 5.2 and 5.3. If > they are semantically the same, the same format should be used and > simply referred to here. If the format is different because they > aren't semantically the same ... perhaps there should be a note > explaining how/why. > > Nits/editorial comments: > > There are quite a number of places where "a" or "the" seems to be > omitted before nouns. I'll let the RFC Editor identify/correct those. > > There is a general issue regarding what aspects of local repair are > configured by some external means (e.g., the protectors that PLRs use, > the context identifiers) and what aspects are established > automagically by the defined mechanisms (the bypass tunnels). The > document would have been clearer to me if these were separated > explicitly, but I suspect that it is common usage in routing > specifications for the reader to sort it out. > > Abstract > > IMO it would help if the Abstract mentioned that only IP/MPLS > transport tunnels are protected. I say this because the only part of > this technology that I'd previously heard of was MPLS; such a > specification in the abstract would tell a large body of potential > readers that the document is *not* relevant to their situation. > > 1. Introduction > > The mechanism is applicable to LDP signaled PWs. It is relevant to > networks with redundant PWs and multi-homed CEs. It is designed on > the basis of MPLS upstream label assignment and context-specific > label switching [RFC5331]. > > It might be more informative to say "Fast failure protection for > pseudowire endpoints" rather than "The mechanism". > > Which of the factors listed are prerequisites for the use of this > mechanism and which are factors which this mechanism additionally > supports? > > Whichever of these are prerequisites for the solution should be > mentioned as such in the Abstract, including particularly that the PW > must be carried by IP/MPLS transport tunnels (as described in section > 4.1 paragraph 2). > > Fast protection refers to its ability to > restore traffic in the order of tens of milliseconds. Compared > with > global repair and control plane repair, this mechanism can provide > faster service restoration. > > What is the time scale of "global repair and control plane repair"? > Given that you give the time scale of "fast protection", it would be > informative to have comparable values for other repair techniques. > > However, it is intended to complement > those mechanisms, rather than replacing them. > > It might be useful to explain why global/control plane repair is still > needed. (Then again, that might be obvious to anyone in the field.) > > 3.1. Single-Segment PW > > For either mode, when considering the traffic flowing in a given > direction over an active path, this document views the ACs, PEs and > PWs to serve primary or backup roles. In particular, the ACs, PEs > and PW along this active path are primary, while those along the > other path are backup. Note that in the active-active mode, the > backup path is an active path by itself, carrying its own share of > traffic while protecting the other active path. > > The wording here doesn't seem to be quite right. I think you need to > phrase it more like this: > > For either mode, when considering the traffic flowing in a given > direction over an active path, this document views the ACs, PEs and > PWs to serve primary or backup roles or both. In particular, given > an active path, the ACs, PEs > and PW along that active path have primary roles, while those along > the > other path have backup roles. Note that in the active-active mode, > each AC, PE, and PW has a primary role (due to being on an active > path) and also a backup role protecting the other path (which is > also active). > > -- > > For clarity, primary egress AC, primary egress PE, backup egress > AC, > and backup egress PE may simply be referred to as primary AC, > primary > PE, backup AC, and backup PE, respectively, when the context of a > discussion is egress endpoint. > > This is correct, but it seems to overlook the use of "primary PE" to > mean "the PE that is being protected (when we are considering a > particular PLR and protector)". That usage is common in the rest of > the document, but difficult for the naive reader to abstract from the > above text. > > 4.1. Applicability > > In a network where > transport tunnels may provide ECMP to primary PEs, care should be > taken to prevent misordered packet delivery during local repair. > > Should "ECMP" get a reference? Or is it so common that any reader of > this document can be assumed to know already? > > In a network where > transport tunnels may provide ECMP to primary PEs, care should be > taken to prevent misordered packet delivery during local repair. > > Perhaps this should be qualified with "if the PW or some flows > within the PW are sensitive to packet misordering", as is mentioned > later in the paragraph. > > Naively, it seems that with ECMP it is impossible to prevent packet > reordering. (E.g., it's hard to imagine using ECMP with ATM circuits > unless the destination performed SAR separately for each path.) > However, the scenario developed in the paragraph could cause a > dramatic increase in the "distance" that packets are reordered, and > that increase could be a problem. Perhaps the text could emphasize > that the problem isn't so much the absolute presence of reordering but > its magnitude. > > 4.2. Local Repair and Protector > > I find this title peculiarly hard to parse. I think the problem is > the ambiguity whether "local" modifies "protector" or not, as well as > the fact that "local repair" is a concept whereas "protector" is a > singular device. Perhaps "Local Repair and Protectors" ("protectors" > is a class of devices, with similar abstractness as "local repair") or > "Local Repair, Bypass Tunnels, and Protectors" (since bypass tunnels > seem to be a similarly critical part of the solution). > > In anticipation of the failure, the PLR MUST > also pre-establish a bypass tunnel to a "protector", and > pre-install > a bypass route in the data plane. > > It might be clearer to say "a bypass route for the bypass tunnel in > the data plane". > > It would be useful to mention here that the protector is either a > backup PE which, in a sense, functions as an alternative terminus for > the PW segment, or a router which will forward traffic to such a > backup PE. > > This raises the point that, in a way, the bypass tunnel is an > alternative continuation of the protected transport tunnel. E.g., it > has as the destination endpoint the same virtual interface address > (the context identifier). Or does the PLR, when repairing, act as a > proper downstream PE for the primary PW segment, and a proper upstream > PE for the PW segment in the bypass tunnel? (I'm likely not using the > terminology correctly here, but I suspect that the terminology is not > used in a strictly correct manner in section 4.2.) > > Upon > detecting the failure, the PLR invokes the bypass route in the data > plane, and reroutes PW traffic to the protector through the bypass > tunnel. > > Does "invokes the bypass route" have a clear meaning, i.e., does this > use of "invoke" have a proper definition? I think you can elide that > phrase and just say "the PLR reroutes PW traffic to the protector > ...", with better clarity. > > In this document, the PLR > simply computes and establishes a node protection bypass tunnel in > the same fashion as the normal IP/MPLS node protection, except that > with the notion of context identifier, the bypass tunnel will be > established from the PLR to the protector (Section 4.6). > > Is there a reference for "the normal IP/MPLS node protection"? > Without the context identifier, what would happen which contrasts to > "from the PLR to the protector"? > > On the other > hand, this imposes a requirement on the protector that it MUST be > able to forward the packets based on a PW label that is assigned by > the primary PE, and ensure that the traffic MUST eventually reach > the > target CE. > > More strictly, the protector must forward the packets *along the > backup path*. Of course, that's implied by the rest of the document, > but it might be worth including that fact in this statement. It also > implies that all of the PEs along the backup path must be able to > forward the packets based on the PW label of the primary path. That's > also implied, but I don't think it's stated anywhere. > > 4.3. Context Identifier > > Likewise, the PWs terminated on a primary PE may be protected by > multiple protectors, each for a subset of the PWs. > > I think this is actually "PW segments" rather than "PWs" (since a PW > only terminates at the egress PE). There seems to be a general issue > in the document of using "PW" in places where the strictly correct > term > is "PW segment". > > But compare with section 4.2, "This requires that the protector given > by the bypass tunnel MUST be intended for all the PWs carried by the > [primary] transport tunnel." That means that all PWs in one tunnel > must be within one of these subsets of PWs. Is that really correct? > > It seems to me that the context identifier is used as the address of > the virtual interface that is downstream end of the protected tunnel. > If so, it would have helped me make a mental model of the process by > stating that here in section 4.3 -- 4.3 states that a context > identifier is an address, but does not specify what it is the address > of. > > What is the mechanics of context identifiers in the context of > protecting egress ACs? Since there is a bypass tunnel, there must be > an address of the virtual interface within the protector that is the > terminus of the bypass tunnel. It seems like this address should be > the context identifier used by the PLR, but in this situation there is > no downstream "primary PE" from which to make a {primary PE, > protector} pair. I think that what works is to use the destination CE > in place of the primary PE. That's straightforward, since the pairs > are never directly represented in the protocol, so most of the > mechanics should be unchanged. In this situation, the PLR can't learn > the context identifier from the primary PE, but I suppose it can > invent/be configured with the context identifier itself, since it > doesn't need to establish a tunnel to the primary PE. It seems like > this should all be analogous to the described cases, but as this is a > specification, how the analogy works should be made explicit. > > 4.3.1. Semantics > > A distinct context identifier MUST be assigned to > the primary PE and each protector. > > This sentence is not correct as written since it states that context > identifiers are assigned to primary PEs and also are assigned to > protectors. I think you meant to say > > A distinct context identifier MUST be assigned to > each pair of primary PE and protector that it uses. > > or > > A distinct context identifier MUST be assigned to > each {primary PE, protector} pair. > > 4.3.2. FEC > > In an MPLS network, a context identifier represents a FEC > (Forwarding > Equivalence Class) for transport tunnels and bypass tunnels > destined > for it. > > This doesn't seem to match the definition of "context identifier" > given previously, as a single context identifier can be used by > several PW segments ending at a PE, which PWs carry packets of > different FECs. Is this perhaps "context label"? Or are there > multiple meanings of "context identifier" which should be explicitly > disambiguated? Or am I misreading this? > > 4.5. Transport Tunnel > > In egress PE node protection and S-PE node protection, > when a node failure is detected on any ECMP branch, the penultimate > hop router SHOULD act as a PLR to reroute all the traffic of the > ECMP > set to the protector. > > I think "node failure" should be "link failure" here -- if there is > *node* failure (of the PE), then all of the ECMP paths will not work, > and the PLR would have to reroute all the traffic anyway. > > 4.6. Bypass Tunnel > > A bypass tunnel SHOULD NOT need to be further protected against a > transit link failure, transit node failure, or egress node failure. > > I think this is an incorrect use of "SHOULD NOT", as it is not a > qualifier of a requirement on a device. I think that you mean > > A bypass tunnel SHOULD NOT be protected against a > transit link failure, transit node failure, or egress node failure. > > But you might want to explain it with > > There is little or no benefit from protecting a bypass tunnel, so > a bypass tunnel SHOULD NOT ... > > 4.7. Examples of Forwarding State > > In the forwarding plane, it is indicated by > the bypass tunnel(s) destined for the context identifier. > > What is "it"? I think it refers to "each label space", but in that > case I think "they" would be the preferred usage. Better, replace > "it" with a noun phrase. > > 5. Revertive Behavior > > Possible triggers of > global repair include PW status notification, VCCV, BFD, end-to- > end OAM between CEs, etc. > > I see that "VCCV" is the topic of RFC 5085, but it is not the name of > a page in Wikipedia, which suggests that giving a reference for it > would be useful. > > Particularly in the > case of egress PE and S-PE node failures, if the ingress PE or the > protector loses communication with the (S-)PE for an extensive > period > of time, LDP session may go down. > > "the (S-)PE" isn't unique in this context, as there is an ingress PE > in the discourse. I think you mean "the primary (S-)PE" (i.e., the > protected PE). > > 6. LDP Extensions > > To facilitate the procedures, > > Usually "facilitate" only applies to making easier something that > could be done already. In this case, the new TLV is necessary to > enable these procedures, so I suggest s/facilitate/support/. > > The procedures in this section are only applicable, if the > protector > advertises the Egress Protection Capability TLV, the primary PE > supports the advertisement of the Protection FEC Element TLV, and > in > the centralized protector model, the backup PE also supports the > advertisement of the Protection FEC Element TLV. > > My understanding is that "the procedures in this section" means > "establishing endpoint fast failure protection via LDP". That seems > to be sufficiently important that it is probably worth expanding this > sentence into a bullet list: > > The procedures in this section are only applicable if: > > o the protector advertises the Egress Protection Capability > TLV, > > o the primary PE supports the advertisement of the Protection > FEC Element TLV, and > > o in the centralized protector model, the backup PE also > supports the advertisement of the Protection FEC Element TLV. > > Subtly, this seems to be the list of preconditions under which the PLR > can establish the needed tunnels and perform the local repair -- this > list does not list that the PLR must implement the needed behaviors, > which is obviously a precondition of local repair. Or perhaps it is a > list of LDP capabilities that are required to set up local repair via > LDP. In either case, it suggests the first clause could be revised > to change "the procedures ..." with a phrase that is more specific as > to what, precisely, is enabled by the conjunction of these three > items. > > 6.1. Egress Protection Capability TLV > > The TLV Code Point is TBD. It needs to be assigned by IANA. > > It seems like this should be flagged with a note so the RFC Editor can > put in the assigned value. > > The "Capability Data" is encoded with the context identifier of the > {primary PE, protector}. > > This should be something like: > > The "Capability Data" is encoded with the context identifiers for > the {primary PE, protector} pairs for which the advertiser is the > protector. > > 6.4. Protection FEC Element TLV > > Does there need to be a specification for the "Reserved" fields? > Usually, there is a specification "must be sent with zeros/must be > ignored on receipt". (Or is there a global specification of this in > LDP?) > > ~ PW Information ~ > > Are there previous definitions of "PW information" in PW/LDP/MPLS > usage? It looks a lot like what's defined in RFC 4447 sections 5.2 > and 5.3. If they are semantically the same, the same format should be > used and simply referred to here. If they aren't semantically the > same ... perhaps there should be a note explaining why. > > 7. IANA Considerations > > The assignment for the Egress Protection Capability TLV should be > described more definitively. My impression is that it should be > assigned out of the "TLV Type Name Space" in the LDP parameters page, > but that doesn't seem to be stated explicitly. Also, there seem to be > three blocks of values (0x0001-0x07FF, 0x0800-0x08FF, 0x0900-0x3DFF) > that are all marked as "IETF consensus", but which presumably differ > in some manner. Which block should the value be in? > > Perhaps for clarity, an appropriate skeleton protocol assignment table > row should be shown. > > Value Hex Name Label Advertisement Discipline > ------------------------------------------------------------------- > 131 0x83 Protection FEC Element DU > > Section 6.2 calls the new TLV in the Label Mapping message "a > Protection FEC Element TLV", but section 7 calls it an "LDP Protection > FEC Type Name Space value". The latter phrase consists of 7 > successive nouns and is (IMO) unparsable by a reader who doesn't > already know what it means. I suggest changing it to "the Protection > FEC Element value in the LDP FEC Type Name Space" (which aligns with > the titles used in the IANA protocol assignment web page). > > It appears that the values in the FEC Type Name Space are 8 bits but > one place in section 7 gives the value as "0x083", which should > presumably be changed to "0x83". > > 8. Security Considerations > > In all scenarios, the role of > protector is entirely managed by network operator, and backup PEs > can > be used anyway to host PWs and LDP sessions. > > Perhaps "anyway" could be clarified as: > > ... and, regardless of local repair, backup PEs can > be used to host PWs and LDP sessions. > > -- > > In general, [RFC5920] describes the security framework for MPLS > networks. [RFC3209] describes the security considerations for RSVP > LSPs. [RFC5036] describes the security considerations for the base > LDP specification. [RFC5561] describes the security considerations > which apply when using the LDP capability mechanism. All these > security framework and considerations apply to this document as > well. > > Is there a reference for security considerations for "IP tunnels" (as > mentioned in section 4.3.1 and 4.6)? > > Dale > >