Hi Dale,

Thank you for the detailed review of the document. Please see the updated document and diff files attached to this email, which address your comments. We have addressed your comments as follows, inline with <RG> ….

On 2017-01-12, 4:29 PM, "Dale Worley" <worley@xxxxxxxxxxx> wrote:

Reviewer: Dale Worley
Review result: Ready with Nits

I am the assigned Gen-ART reviewer for this draft. The General Area Review Team (Gen-ART) reviews all IETF documents being processed by the IESG for the IETF Chair. Please treat these comments just like any other last call comments.

For more information, please see the FAQ at <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>.

Document: draft-ietf-teas-gmpls-resource-sharing-proc-06
Reviewer: Dale R. Worley
Review Date: 12 Jan 2017
IETF LC End Date: 17 Jan 2017
IESG Telechat date: 2 Feb 2017

Summary: This draft is basically ready for publication, but has nits that should be fixed before publication.

There are various places where the wording of the draft is unclear. The draft would benefit from a careful editing for clarity. Particularly, there are a considerable number of places where the use of "the" and "a" and of plurals is not standard or leaves the text somewhat uncertain.

There are various references to ASSOCIATION objects, SESSION_ATTRIBUTE objects, etc. The text leaves it unclear where these objects live; it talks as if they exist in an abstract sense. I think I managed to track down what is going on in RFC 4872, which is that the Path message that sets up an LSP contains an array of objects and all of the objects described are parts of the respective LSP setup Path messages. I also suspect that the Path message objects are retained by the various nodes as permanent information about the LSPs that they have configured, so one can speak unambiguously of "the ASSOCIATION object of the LSP" long after the LSP is set up.
If all of this is correct, it would help the naive reader if this was spelled out at the beginning of the document and/or the wording was changed in places to provide this context. E.g.,

   GMPLS LSPs can share resources during LSP setup if they have Shared Explicit (SE) flag set in their SESSION_ATTRIBUTE objects and:

could be clarified as

   GMPLS LSPs can share resources during LSP setup if they have Shared Explicit (SE) flag set in the SESSION_ATTRIBUTE objects in the Path messages that create them and:

<RG> Edited the document to clarify (at multiple places by using suggested text above).

There are a number of terms that are unclear to me. It's possible that they have very standard meanings in GMPLS or traffic engineering, though. Is there a terminology section in a referenced RFC that could be pointed to to define these various words?

<RG> Added Section 2. [RFC4427] defines terminology for the GMPLS recovery (protection and restoration).

1. Introduction

   to setup Label Switched Paths (LSPs) in non-packet transport

The form "set up" is a verb, whereas "setup" is a noun (naming an instance of the action of setting up) or an adjective (specifying that something has to do with setting up). So in this instance, the wording should be "set up". Other uses of "setup/set up" should be checked also.

<RG> Edited at multiple places.

   As described in [RFC6689], an ASSOCIATION object can be used to identify the LSPs for restoration using Association Type set to "Recovery" [RFC4872] and also identify the LSPs for resource sharing using Association Type set to "Resource Sharing" [RFC4873].

The ordering of the phrases in this sentence is somewhat confusing because "using Association Type set to xxx" is a qualifier of "an ASSOCIATION object", yet the phrase "can be used to yyy" is between them. Clearer to say:

   As described in [RFC6689], an ASSOCIATION object with Association Type "Recovery" [RFC4872] can be used to identify the LSPs for restoration.
   Also, an ASSOCIATION object with Association Type "Resource Sharing" [RFC4873] can be used to identify the LSPs for resource sharing.

<RG> Edited.

--

   Generally GMPLS end-to-end recovery schemes have the restoration LSP signaled after the failure has been detected and notified on the working LSP.

Is "signaled" used here in a standard way for GMPLS? It seems that "the LSP is signaled" is to mean "the LSP is set up", but it took me some time to realize that. I am used to "X is signaled" meaning "a signal is sent to X". (There are many instances of this usage.)

<RG> Used the term “set up” in most places to be consistent.

It would also be useful for the reader to know the difference between "protection", "restoration", and "recovery". I think that "protection" is anti-failure paths set up *before* any failure, "restoration" is anti-failure paths set up *after* a failure, and "recovery" includes both "protection" and "restoration". Is this standard terminology within GMPLS, or should the reader be warned about it?

<RG> Added Section 2. [RFC4427] defines terminology for the GMPLS recovery (protection and restoration).

   In non-packet transport networks, as working LSPs are typically signaled over a nominal path,

What is the meaning of "nominal" here? ("nominal" has a number of meanings, some of which are largely contradictory.)

   can be reverted to the nominal path when the failure is repaired

<RG> Replaced nominal with preferred.

In this context, the meaning of "reverted" is made clear by the clause "when the failure is repaired..." -- as opposed to other uses of "reverted".

   In this document, procedures are reviewed for

It's probably better to say "we review procedures for...".

<RG> Edited.

   o When using end-to-end recovery with revertive mode, methods for LSP reversion and resource sharing are summarized in this document.

A definition of "revert/revertive/reversion" would be useful.

<RG> This is now elaborated in Section 3.2. RFC4427, section 11 has details.

2.
Overview The GMPLS end-to-end recovery scheme, as defined in [RFC4872] and being considered in this document, "fully dynamic rerouting switches normal traffic to an alternate LSP that is not even partially established only after the working LSP failure occurs. The new alternate route is selected at the LSP head-end node, it may reuse resources of the failed LSP at intermediate nodes and may include additional intermediate nodes and/or links". It is awkward to visually coordinate the quotation marks in this paragraph. If it is important that the text is quoted from RFC 4872, given its length, it should be presented as a block-quote. If not, the quotation marks should be omitted and just the reference given. If the intention is to quote this text, it should be corrected so that it matches the passage from RFC 4872. In particular, the difference between "fully dynamic rerouting" (in the draft) and "Full LSP rerouting (or restoration)" needs to be resolved, as there might be a difference in meaning. The grammar does not join "The GMPLS end-to-end recovery scheme ..." and "... fully dynamic rerouting switches normal traffic". Perhaps something like: The GMPLS end-to-end recovery scheme, as defined in [RFC4872] and being considered in this document, switches normal traffic to an alternate LSP that is not even partially established only after the working LSP failure occurs. The new alternate route is selected at the LSP head-end node, it may reuse resources of the failed LSP at intermediate nodes and may include additional intermediate nodes and/or links. <RG> Edited the text. -- Two examples, 1+R and 1+1+R are described in the following sections. At this point in the text, it's not clear what category these items are examples *of*. They aren't single recovery situations, as one would expect of something labeled "example". They seem to be sub-categories of "The GMPLS end-to-end recovery scheme". 
So it would be better to use phrasing like "Two forms of end-to-end recovery, ..., are described in the following sections." or "Two end-to-end recovery schemes/situations ...". I assume that other variants of end-to-end recovery exist, and this draft is applicable to some/many/all of them. To guard against misunderstanding, it would be worth saying so by adding something like "Many other forms of end-to-end recovery exist, many of which [or whatever] can use these RSVP-TE signaling techniques." <RG> Edited text with above suggestions. Given that sections 2.1 and 2.2 form a pair of examples, it might be useful to distinguish them from "Resource Sharing By Restoration LSP" (which is not an example, and is not somehow an alternative to 1+R and 1+1+R) by renumbering the sections to: 2. Overview 2.1. Examples 2.1.1. 1+R Restoration 2.1.2. 1+1+R Restoration 2.2. Resource Sharing By Restoration LSP In that case, the introductory sentence "Two examples..." would move to the new section 2.1. <RG> Updated sections. Where do the names "1+R" and "1+1+R" come from and do they have meaning beyond being arbitrary labels? <RG> It is defined in this document. Also, given that the 1+1+R case is split into four sub-cases, it's not clear that the split between 1+R and 1+1+R is fundamental. It seems that there is an array of semi-independent choices: whether there is an ongoing protection LSP, how many restoration LSPs may be established (no more than the number of ongoing LSPs), how many failures of original LSPs must happen before restoration LSPs are established; various combinations of these choices yield various restoration techniques. Looked at that way, it might be worth combining both examples into one. But that has the problem that figure 2 looks considerably different from figure 1. OTOH, figure 2 isn't particularly accurate for the situation with two restoration LSPs, and perhaps those two cases should be split into another section with its own figure. 
<RG> Created section 3.1.2.1 and moved text there. 2.1. 1+R Restoration Unlike a protection LSP, a restoration LSP is signaled per need basis. Is "restoration" a standard word in this field? If not, there should be some sort of terminology section that states clearly the difference between "protection" and "restoration". <RG> Yes as per [RFC4427]. 2.2. 1+1+R Restoration This paragraph could use rewording to be clearer: After a failure detection and notification on a working LSP or protecting LSP, a third LSP on path A-H-I-J-Z is established as a restoration LSP. Since the working LSP has already been described, this should be "the working LSP". <RG> Edited the text. The restoration LSP in this case provides protection against a second order failure. It would probably be better to explain what the "second order failure" is: The restoration LSP in this case provides protection against failure of both the working and protecting LSPs. <RG> Edited the text. -- During failure switchover with 1+1+R recovery scheme, in general, failed LSP resources are not released so that working, protecting and restoration LSPs coexist in the network. Nonetheless, a restoration LSP with the working LSP it is restoring as well as a restoration LSP with the protecting LSP it is restoring can share network resources. For ease of reading, better to split the two cases apart, and not use "it is restoring" as we haven't introduced "restore" as a transitive verb: The restoration LSP can share network resources with the working LSP, and it can share network resources with the protecting LSP. <RG> Edited the text. -- Typically, restoration LSP is torn down when the failure on the original (working or protecting) LSP is repaired and the traffic is reverted to the original LSP. Strictly, Typically, the restoration LSP is torn down when both the working and protecting LSPs are repaired and the traffic is reverted to the original LSP. Except that's not correct, either. 
Probably the practice is that a restoration LSP is torn down when enough original LSPs are repaired to bring the failure count below the threshold that triggered the setting up of the restoration LSP (which varies among the four models). But that's awkward to write, even though that is the correct statement. <RG> Edited the text. -- In all models discussed, if the restoration LSP also fails, it is torn down and a new restoration LSP is signaled. You can't say "the restoration LSP" because some of the models have more than one. Better In all these models, if a restoration LSP also fails, it is torn down and a new restoration LSP is signaled. <RG> Edited the text. 2.3. Resource Sharing By Restoration LSP it allows for resource sharing when the LSP traffic is dynamically restored after the link failure The significance of this phrase isn't clear to me. One possible sense is that since the failure that is being discussed is the C-D link failure, then necessarily the resources from A to C can be reused. But that meaning doesn't work well here, because we haven't introduced what the failure is. (Also, you use the phrase "the link failure" before introducing what the link failure is.) It seems like the potential for resource sharing is a property of the LSP that it might not have, but the text doesn't point that out clearly as an assumption of the example. Perhaps Using the network shown in Figure 3 as an example, LSP1 (A-B-C-D-E) is the working LSP, and assume it allows for resource sharing when the LSP traffic is dynamically restored. <RG> Edited the text. -- In this case, A-B-C-F-G-E is chosen as the restoration LSP path and the resources on the path segment A-B-C are re-used by this LSP when the working LSP is not torn down (e.g. in 1+R recovery scheme). "when" isn't the right word here, because the re-using the resources doesn't wait for the working LSP to be not torn down. 
Perhaps:

   In this case, A-B-C-F-G-E is chosen as the restoration LSP path and the resources on the path segment A-B-C are re-used by this LSP. The working LSP is not torn down.

<RG> Edited the text.

3.1. Restoration LSP Association

   For example, when a restoration LSP is signaled for a failed working LSP, the ASSOCIATION object in the restoration LSP contains the Association ID and Association Source set to the Association ID and Association Source signaled in the working LSP for the "Recovery" Association Type.

As a general question, where does the association object live? Clearly it isn't "in the restoration LSP". It would be useful to mention this for readers who aren't fully familiar with the background:

   For example, when a restoration LSP is signaled for a failed working LSP, the ASSOCIATION object in the Path message that establishes the restoration LSP contains ...

<RG> Edited the text at multiple places.

3.2. Resource Sharing-based Restoration LSP Setup

   As described in [RFC3209], Section 2.5, the purpose of make-before-break is "not to disrupt traffic, or adversely impact network operations while TE tunnel rerouting is in progress". In non-packet transport networks, the label has a mapping into the data plane resource used and the nodes along the LSP need to send triggering commands to data plane for setting up cross-connections accordingly during the RSVP-TE signaling procedure. Due to the nature of the non-packet transport networks, a node may not be able to fulfill this purpose when sharing resources in some scenarios.

I can understand this paragraph, but I think it could benefit from a number of edits. The first is to remove the quotation marks, since the purpose is not to emphasize that RFC 3209 said those words, but rather that 3209 stated the same concept. And I think some of the explanation can be omitted without losing clarity.
   As described in [RFC3209], Section 2.5, the purpose of make-before-break is not to disrupt traffic, or adversely impact network operations while TE tunnel rerouting is in progress. In non-packet transport networks during the RSVP-TE setup procedure, the nodes along the LSP set up cross-connections accordingly. Because a cross-connection cannot simultaneously connect a shared resource to different resources in two alternative LSPs, nodes may not be able to fulfill this promise when LSPs share resources.

<RG> Edited the text.

--

   ---------+---------------------------------------------------------
   Category | Node Behavior during Restoration LSP Setup
   ---------+---------------------------------------------------------
      C1    + Reusing existing resource on both input and output
            + interfaces (nodes A & B in Figure 3).
            +
            + This type of node needs to book the existing
            + resources and no cross-connection setup
            + command is needed.
   ---------+---------------------------------------------------------

This would be prettier if most of the +'s were turned into |'s:

<RG> Edited the table.

   ---------+---------------------------------------------------------
   Category | Node Behavior during Restoration LSP Setup
   ---------+---------------------------------------------------------
      C1    | Reusing existing resource on both input and output
            | interfaces (nodes A & B in Figure 3).
            |
            | This type of node needs to book the existing
            | resources and no cross-connection setup
            | command is needed.
   ---------+---------------------------------------------------------

Note that the items in the second column of the table are composed of two parts: The first part is the condition that defines which nodes are in that category, and the second part is the actions that will be taken by such nodes. Ideally, these would be broken out as separate columns. (The current first column provides the labels C1, C2, and C3, but those aren't referenced anywhere in the document, and could be omitted to save space.)
That revises the table to look like this:

   ------------------------------------+------------------------------
   Situation                           | Actions
   ------------------------------------+------------------------------
   Reusing existing resources          | Book the existing resources.
   on both input and output interfaces | No cross-connection setup is
   (nodes A & B in Figure 3).          | needed.
   ------------------------------------+------------------------------
   Reusing existing resource only on   | Book the resources.
   one of the interfaces (either input | Re-configure the cross-connection
   or output) and uses new resource on | to connect the re-used resource
   the other interface.                | to the new resource.
   (nodes C & E in Figure 3).          |
   ------------------------------------+------------------------------
   Using new resources on both         | Book the new resources.
   interfaces.                         | Send the cross-connection setup
   (nodes F & G in Figure 3).          | command on both interfaces.
   ------------------------------------+------------------------------

<RG> Edited the table.

Is the meaning of "book" well-known? I find no use of it elsewhere in this document or in any of the references.

<RG> Replaced “book” with “reserve”.

   Depending on whether the resource is re-used or not, the node behaviors differ.

Of course, the different behavior is only because we are here optimizing the establishment of the new LSP. A node could send a command to cross-connect two resources that are already connected.

   This deviates from normal LSP setup since some nodes do not need to re-configure the cross-connection, and it should not be viewed as an error.

Why would this (not sending a command to connect things that are already connected) be considered an error under any circumstances?

<RG> Removed the line to avoid confusion.

3.3. LSP Reversion

Is "reversion" a standard term?

<RG> Yes. RFC4427, Section 4.11.

   If the end-to-end LSP recovery is revertive, as described in Section 2 ...

I'm not sure how the phrase "If the end-to-end LSP recovery is revertive" works.
"Recovery" seems to be a general term for techniques to recover from link failures and the like. Is this describing a "revertive" recovery method, or is it describing an instance of recovery which is somehow "revertive"? Compare to "revert", which seems to be the action of putting the traffic back on the original/protection LSP once its functionality is restored. I would expect that behavior to be universal. <RG> Edited the text. 1. Make-while-break Reversion, where resources associated with a working or protecting LSP are reconfigured while removing reservations for the restoration LSP. It's not clear to me what sort of reconfiguring is being discussed. Assuming that "reversion" means "when the working/protecting LSP starts working again, traffic is restored to that path", its not clear what sort of reconfiguration would be needed, as the working/protecting LSP already exists. I suspect that this issue shows up when the working/protecting LSP shares resources with the restoration LSP, and moving traffic to the restoration LSP may require reconfiguring resources, and so moving traffic back to working/protecting LSP may require reversing that reconfiguration. But the initial reconfiguration has not been mentioned. Should some sort of general description be put in "Resource Sharing By Restoration LSP" of the possible need to reconfigure when moving traffic to or from a restoration LSP? (This is all rather obvious, but it would help if it was clearly described.) <RG> Added text in Section 3.2. 3.3.1. Make-while-break Reversion Removing reservations for restoration LSP triggers reconfiguration of resources associated with a working or protecting LSP on every node where resources are shared. Could you add an explanation or pointer why this is so? 
It seems that for this to be true, the reservation process must broadcast an explicit prioritization between the new (restorative) reservation and the old (working) reservation, because the node that is reconfigured has to remember both reservations, and revert to the working one when the restorative one is deleted. It'd be useful for the naive reader to know where in RSVP-TE that information is broadcast and/or how RSVP-TE specified that nodes have to remember that information.

<RG> Added text to state that the working LSP state is not torn down.

   Deletion of restoration LSPs is not a revertive process.

What is the meaning of "revertive process" here? It doesn't seem to match the sense of "revertive" as used elsewhere.

<RG> Removed this line to avoid confusion.

   In particular, if RSVP packets are lost due to nodal or DCN failures it is possible for an LSP to be only partially deleted.

"nodal" should probably be "node". What is "DCN"? I can't find it in any of the referenced RFCs. Does "link" work as a replacement?

<RG> Corrected the text.

3.3.2. Make-before-break Reversion

   Instead of relying on deletion of restoration LSP, the head-end chooses to establish a new LSP to reconfigure resources on the working or protection LSP path, and uses identical ASSOCIATION and PROTECTION objects from the LSP it is replacing.

This could be made clearer by consistently labeling the new LSP as the "reversion" LSP. Also, state explicitly that its resources exactly duplicate the resources of the working/protection LSP that is being reverted:

   Instead of relying on deletion of the restoration LSP, the head-end chooses to establish a new "reversion" LSP that duplicates the configuration of the resources on the working or protection LSP, and uses identical ASSOCIATION and PROTECTION objects for that LSP.

<RG> Edited the text.

--

   Reversion LSP is sharing resources both with working and restoration LSPs.
Better

   The reversion LSP shares all of the resources of the working/protection LSP and may share resources with the restoration LSP.

<RG> Edited the text.

--

   Hence, after reversion LSP is created, data plane configuration essentially reflects working or protecting LSP reservations.

It seems like "essentially" is not needed, because the data plane configuration will *exactly* reflect the working/protecting LSP reservations. Or are there minor variations in how reservations are done that may not be exactly duplicated by the reversion LSP?

<RG> Edited the text.

   After "make" part is finished, working and restoration LSPs are torn down.

Perhaps emphasize "the original working/protection and restoration LSPs are torn down", as the reversion LSP becomes the new working/protection LSP.

<RG> Edited the text.

   o Rollback

      If "make" part fails, (existing) restoration LSP will still be used to carry existing traffic. Same logic applies here as for any MBB operation failure.

The reasoning here is not clear to me. If the "make" operation fails, some of the nodes may be configured for the reversion LSP, while others will be configured for the restoration LSP. Or is it implicit that creating LSPs is an atomic operation network-wide, that incomplete LSP creations will be completely purged from the network?

If the latter is true, then the core of this discussion is that creating LSPs is atomic across the network, but *deleting* LSPs is not (and so make-while-break can fail to work). If that difference is true, it should be said explicitly somewhere near the beginning of section 3.3, as that fact is what is driving the whole discussion.

<RG> This is because the original restoration LSP is not torn down in this (MBB) case (as opposed to the MWB case). But yes, the node will need to be reconfigured if needed.

Thanks,
Rakesh (for authors and contributors)

[END]
TEAS Working Group                                              X. Zhang
Internet-Draft                                             H. Zheng, Ed.
Intended Status: Informational                       Huawei Technologies
Expires: July 18, 2017                                    R. Gandhi, Ed.
                                                                  Z. Ali
                                                     Cisco Systems, Inc.
                                                           P. Brzozowski
                                                            ADVA Optical
                                                        January 14, 2017


      RSVP-TE Signaling Procedure for End-to-End GMPLS Restoration
                          and Resource Sharing

             draft-ietf-teas-gmpls-resource-sharing-proc-07

Abstract

   In non-packet transport networks, there are requirements where the
   Generalized Multi-Protocol Label Switching (GMPLS) end-to-end
   recovery scheme needs to employ a restoration Label Switched Path
   (LSP) while keeping resources for the working and/or protecting LSPs
   reserved in the network after the failure occurs.

   This document reviews how the LSP association is to be provided
   using Resource Reservation Protocol - Traffic Engineering (RSVP-TE)
   signaling in the context of a GMPLS end-to-end recovery scheme when
   using a restoration LSP where the failed LSP is not torn down. In
   addition, this document discusses resource sharing-based setup and
   teardown of LSPs as well as LSP reversion procedures. No new
   signaling extensions are defined by this document, and it is
   strictly informative in nature.

Status of this Memo

   This Internet-Draft is submitted to IETF in full conformance with
   the provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups. Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time. It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

Zhang, et al             Expires July 18, 2017                  [Page 1]
Internet-Draft  GMPLS Restoration and Resource Sharing  January 14, 2017

   The list of current Internet-Drafts can be accessed at
   http://www.ietf.org/ietf/1id-abstracts.txt.
   The list of Internet-Draft Shadow Directories can be accessed at
   http://www.ietf.org/shadow.html.

Copyright Notice

   Copyright (c) 2017 IETF Trust and the persons identified as the
   document authors. All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   (http://trustee.ietf.org/license-info) in effect on the date of
   publication of this document. Please review these documents
   carefully, as they describe your rights and restrictions with
   respect to this document. Code Components extracted from this
   document must include Simplified BSD License text as described in
   Section 4.e of the Trust Legal Provisions and are provided without
   warranty as described in the Simplified BSD License.

Table of Contents

   1. Introduction
   2. Conventions Used in This Document
      2.1. Terminology
      2.2. Acronyms and Abbreviations
   3. Overview
      3.1. Examples of Restoration Schemes
         3.1.1. 1+R Restoration
         3.1.2. 1+1+R Restoration
            3.1.2.1. 1+1+R Restoration - Variants
      3.2. Resource Sharing By Restoration LSP
   4. RSVP-TE Signaling Procedure
      4.1. Restoration LSP Association
      4.2. Resource Sharing-based Restoration LSP Setup
      4.3. LSP Reversion
         4.3.1. Make-while-break Reversion
         4.3.2. Make-before-break Reversion
   5. Security Considerations
   6. IANA Considerations
   7. References
      7.1. Normative References
      7.2. Informative References
   Acknowledgements
   Contributors
   Authors' Addresses

1. Introduction

   Generalized Multi-Protocol Label Switching (GMPLS) [RFC3945] defines
   a set of protocols, including Open Shortest Path First - Traffic
   Engineering (OSPF-TE) [RFC4203] and Resource Reservation Protocol -
   Traffic Engineering (RSVP-TE) [RFC3473]. These protocols can be used
   to set up Label Switched Paths (LSPs) in non-packet transport
   networks. The GMPLS protocol extends MPLS to support interfaces
   capable of Time Division Multiplexing (TDM), Lambda Switching and
   Fiber Switching. These switching technologies provide several
   protection schemes [RFC4426][RFC4427] (e.g., 1+1, 1:N and M:N).

   Resource Reservation Protocol - Traffic Engineering (RSVP-TE)
   signaling has been extended to support various GMPLS recovery
   schemes, such as end-to-end recovery [RFC4872] and segment recovery
   [RFC4873]. As described in [RFC6689], an ASSOCIATION object with
   Association Type "Recovery" [RFC4872] can be signaled in the RSVP
   Path message to identify the LSPs for restoration. Also, an
   ASSOCIATION object with Association Type "Resource Sharing"
   [RFC4873] can be signaled in the RSVP Path message to identify the
   LSPs for resource sharing. [RFC6689] Section 2.2 reviews the
   procedure for providing LSP associations for GMPLS end-to-end
   recovery and Section 2.4 reviews the procedure for providing LSP
   associations for sharing resources.

   Generally, GMPLS end-to-end recovery schemes have the restoration
   LSP set up after the failure has been detected and notified on the
   working LSP.
   For a recovery scheme with revertive behavior, a restoration LSP is
   set up while the working LSP and/or protecting LSP are not torn down
   in the control plane due to a failure. In non-packet transport
   networks, as working LSPs are typically set up over preferred paths,
   service providers would like to keep resources associated with the
   working LSPs reserved. This is to make sure that the service can be
   reverted to the preferred path (working LSP) when the failure is
   repaired to provide deterministic behavior and guaranteed Service
   Level Agreement (SLA).

   In this document, we review procedures for GMPLS LSP associations,
   resource sharing based LSP setup, teardown, and LSP reversion for
   non-packet transport networks, including the following:

   o  Review the procedure for providing LSP associations for the GMPLS
      end-to-end recovery using a restoration LSP where working and
      protecting LSPs are not torn down and resources are kept reserved
      in the network after the failure.

   o  In [RFC3209], the make-before-break (MBB) method assumes the old
      and new LSPs share the SESSION object and signal the Shared
      Explicit (SE) flag in the SESSION_ATTRIBUTE object for sharing
      resources. According to [RFC6689], an ASSOCIATION object with
      Association Type "Resource Sharing" in the Path message enables
      the sharing of resources across LSPs with different SESSION
      objects. The procedure for resource sharing using the SE flag in
      conjunction with an ASSOCIATION object is discussed in this
      document.

   o  When using an end-to-end recovery scheme with revertive behavior,
      methods for LSP reversion and resource sharing are summarized in
      this document.

   This document is strictly informative in nature and does not define
   any RSVP-TE signaling extensions.

2. Conventions Used in This Document

2.1. Terminology

   The reader is assumed to be familiar with the terminology in
   [RFC3209], [RFC3473], [RFC4872], [RFC4873] and [RFC4427].

2.2. Acronyms and Abbreviations

   GMPLS: Generalized Multi-Protocol Label Switching

   LSP: An MPLS Label Switched Path

   MBB: Make Before Break

   MPLS: Multi-Protocol Label Switching

   RSVP: Resource ReSerVation Protocol

   SE: Shared Explicit flag

   TDM: Time Division Multiplexing

   TE: Traffic Engineering

3. Overview

   The GMPLS end-to-end recovery scheme, as defined in [RFC4872] and
   being considered in this document, switches normal traffic to an
   alternate LSP that is not even partially established only after the
   working LSP failure occurs. The new alternate route is selected at
   the LSP head-end node; it may reuse resources of the failed LSP at
   intermediate nodes and may include additional intermediate nodes
   and/or links.

3.1. Examples of Restoration Schemes

   Two forms of end-to-end recovery schemes, 1+R restoration and 1+1+R
   restoration, are described in the following sections. Other forms of
   end-to-end recovery schemes also exist and they can use these
   signaling techniques.

3.1.1. 1+R Restoration

   One example of the recovery scheme considered in this document is
   1+R recovery. The 1+R recovery scheme is exemplified in Figure 1. In
   this example, a working LSP on path A-B-C-Z is pre-established.
   Typically, after a failure detection and notification on the working
   LSP, a second LSP on path A-H-I-J-Z is established as a restoration
   LSP. Unlike a protecting LSP, which is set up before the failure, a
   restoration LSP is set up on a per-need basis, after the failure.
   +-----+    +-----+     +-----+     +-----+
   |  A  +----+  B  +-----+  C  +-----+  Z  |
   +--+--+    +-----+     +-----+     +--+--+
       \                                /
        \                              /
      +--+--+       +-----+        +--+--+
      |  H  +-------+  I  +--------+  J  |
      +-----+       +-----+        +-----+

   Figure 1: An Example of 1+R Recovery Scheme

During failure switchover with the 1+R recovery scheme, working LSP resources are generally not released, so the working and restoration LSPs coexist in the network. Nonetheless, the working and restoration LSPs can share network resources. Typically, once the failure on the working LSP has been repaired, the restoration LSP is no longer required and is torn down, and the traffic is reverted to the original working LSP.

3.1.2. 1+1+R Restoration

Another example of the recovery scheme considered in this document is 1+1+R. In 1+1+R, a restoration LSP is set up for the working LSP and/or the protecting LSP after the failure has been detected. This recovery scheme is exemplified in Figure 2.

      +-----+       +-----+        +-----+
      |  D  +-------+  E  +--------+  F  |
      +--+--+       +-----+        +--+--+
        /                              \
       /                                \
   +--+--+    +-----+     +-----+     +--+--+
   |  A  +----+  B  +-----+  C  +-----+  Z  |
   +--+--+    +-----+     +-----+     +--+--+
       \                                /
        \                              /
      +--+--+       +-----+        +--+--+
      |  H  +-------+  I  +--------+  J  |
      +-----+       +-----+        +-----+

   Figure 2: An Example of 1+1+R Recovery Scheme

In this example, a working LSP on path A-B-C-Z and a protecting LSP on path A-D-E-F-Z are pre-established. After a failure is detected and notified on the working LSP or protecting LSP, a third LSP on path A-H-I-J-Z is established as a restoration LSP. The restoration LSP in this case provides protection against failure of both the working and protecting LSPs. During failure switchover with the 1+1+R recovery scheme, failed LSP resources are generally not released, so the working, protecting and restoration LSPs coexist in the network.
The restoration LSP can share network resources with the working LSP, and it can share network resources with the protecting LSP. Typically, the restoration LSP is torn down once the traffic has been reverted to the original LSP and the restoration LSP is no longer needed.

There are two possible models when using a restoration LSP with the 1+1+R recovery scheme:

o A restoration LSP is set up after either a working or protecting LSP fails. Only one restoration LSP is present at a time.

o A restoration LSP is set up after both working and protecting LSPs fail. Only one restoration LSP is present at a time.

3.1.2.1. 1+1+R Restoration - Variants

Two other possible variants exist when using a restoration LSP with the 1+1+R recovery scheme:

o A restoration LSP is set up after either a working or protecting LSP fails. Two different restoration LSPs may be present, one for the working LSP and one for the protecting LSP.

o Two different restoration LSPs are set up after both working and protecting LSPs fail, one for the working LSP and one for the protecting LSP.

In all these models, if a restoration LSP also fails, it is torn down and a new restoration LSP is set up.

3.2. Resource Sharing By Restoration LSP

                        +-----+      +-----+
                        |  F  +------+  G  +--------+
                        +--+--+      +-----+        |
                           |                        |
                           |                        |
   +-----+    +-----+   +--+--+      +-----+     +--+--+
   |  A  +----+  B  +---+  C  +--X---+  D  +-----+  E  |
   +-----+    +-----+   +-----+      +-----+     +-----+

   Figure 3: Resource Sharing in 1+R Recovery Scheme

Consider the network shown in Figure 3 as an example of the 1+R recovery scheme. LSP1 (A-B-C-D-E) is the working LSP; assume that it allows resource sharing when its traffic is dynamically restored. Upon detecting the failure of a link along LSP1, e.g., link C-D, node A needs to decide which alternative path it will use to signal the restoration LSP and reroute traffic.
In this case, A-B-C-F-G-E is chosen as the restoration LSP path, and the resources on the path segment A-B-C are reused by this LSP. The working LSP is not torn down and coexists with the restoration LSP. Nodes A and B reconfigure the resources to set up the restoration LSP by sending cross-connection commands to the data plane.

In a recovery scheme employing revertive behavior, after the failure is repaired, the resources on nodes A and B need to be reconfigured to set up the working LSP. The traffic is then reverted to the original working LSP.

4. RSVP-TE Signaling Procedure

4.1. Restoration LSP Association

Where a GMPLS end-to-end recovery scheme needs to employ a restoration LSP while keeping resources for the working and/or protecting LSPs reserved in the network after the failure, the restoration LSP is set up with an ASSOCIATION object that has the Association Type set to "Recovery" [RFC4872] and the Association ID and Association Source set to the corresponding Association ID and Association Source signaled in the Path message of the LSP it is restoring. For example, when a restoration LSP is signaled for a failed working LSP, the ASSOCIATION object in the Path message of the restoration LSP contains the Association ID and Association Source set to the Association ID and Association Source signaled in the working LSP for the "Recovery" Association Type. Similarly, when a restoration LSP is set up for a failed protecting LSP, the ASSOCIATION object in the Path message of the restoration LSP contains the Association ID and Association Source set to the Association ID and Association Source signaled in the protecting LSP for the "Recovery" Association Type.

The procedure for signaling the PROTECTION object is specified in [RFC4872].
Specifically, the restoration LSP used for a working LSP is set up with the P bit cleared in the PROTECTION object in its Path message, and the restoration LSP used for a protecting LSP is set up with the P bit set in the PROTECTION object in its Path message.

4.2. Resource Sharing-based Restoration LSP Setup

GMPLS LSPs can share resources during LSP setup if they have the Shared Explicit (SE) flag set in the SESSION_ATTRIBUTE objects [RFC3209] in the Path messages that create them and:

o As defined in [RFC3209], the LSPs have identical SESSION objects, and/or

o As defined in [RFC6689], the LSPs have matching ASSOCIATION objects with Association Type set to "Resource Sharing" signaled in their Path messages. LSPs in this case can have different SESSION objects, i.e., different Tunnel ID, Source and/or Destination signaled in their Path messages.

As described in Section 2.5 of [RFC3209], the purpose of make-before-break is to avoid disrupting traffic or adversely impacting network operations while TE tunnel rerouting is in progress. In non-packet transport networks, during the RSVP-TE signaling procedure, the nodes set up cross-connections along the LSP accordingly. Because a cross-connection cannot simultaneously connect a shared resource to different resources in two alternative LSPs, nodes may not be able to meet this make-before-break objective when LSPs share resources.

For LSP restoration upon failure, as explained in Section 11 of [RFC4872], the reroute procedure may reuse existing resources. Reconfiguring cross-connections at intermediate nodes during rerouting does not further impact the traffic, since the traffic has already been interrupted by the LSP failure.
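The association rule of Section 4.1 and the sharing conditions at the start of Section 4.2 can be sketched in a few lines of Python. This is only an illustrative model, not an RSVP-TE implementation: the class names (Session, Association, PathObjects), their fields, and both helper functions are hypothetical, and copying every ASSOCIATION object from the restored LSP into the restoration LSP is an assumption made to keep the example short.

```python
from dataclasses import dataclass
from typing import Tuple

RECOVERY = "Recovery"
RESOURCE_SHARING = "Resource Sharing"


@dataclass(frozen=True)
class Session:
    """Hypothetical stand-in for the SESSION object."""
    destination: str
    tunnel_id: int
    extended_tunnel_id: str


@dataclass(frozen=True)
class Association:
    """Hypothetical stand-in for the ASSOCIATION object."""
    assoc_type: str
    assoc_id: int
    assoc_source: str


@dataclass(frozen=True)
class PathObjects:
    """Objects carried in the Path message that creates an LSP."""
    session: Session
    se_flag: bool                          # SE flag in SESSION_ATTRIBUTE
    associations: Tuple[Association, ...] = ()
    p_bit: bool = False                    # P bit in PROTECTION


def can_share_resources(a: PathObjects, b: PathObjects) -> bool:
    """True if both LSPs signal SE and have identical SESSION objects
    and/or a matching Resource Sharing ASSOCIATION object."""
    if not (a.se_flag and b.se_flag):
        return False
    if a.session == b.session:
        return True
    shared_a = {x for x in a.associations if x.assoc_type == RESOURCE_SHARING}
    shared_b = {x for x in b.associations if x.assoc_type == RESOURCE_SHARING}
    return bool(shared_a & shared_b)


def restoration_path_objects(restored: PathObjects, session: Session,
                             restores_protecting_lsp: bool) -> PathObjects:
    """Reuse the Association ID and Source of the LSP being restored, and
    set the P bit only when a protecting LSP is being restored."""
    if not any(x.assoc_type == RECOVERY for x in restored.associations):
        raise ValueError("restored LSP carries no Recovery association")
    return PathObjects(session=session, se_flag=True,
                       associations=restored.associations,  # assumption: copy all
                       p_bit=restores_protecting_lsp)


working = PathObjects(
    session=Session("Z", 1, "A"), se_flag=True,
    associations=(Association(RECOVERY, 10, "A"),
                  Association(RESOURCE_SHARING, 20, "A")))
restoration = restoration_path_objects(working, Session("Z", 2, "A"),
                                       restores_protecting_lsp=False)
print(can_share_resources(working, restoration))  # prints True
```

Here the two LSPs have different SESSION objects (different Tunnel IDs), so sharing is permitted only because both signal the SE flag and carry the same "Resource Sharing" association.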
The node actions for setting up the restoration LSP can be categorized into the following:

   +----------------------------------+----------------------------------+
   | Category                         | Action                           |
   +----------------------------------+----------------------------------+
   | Reusing existing resource on     | This type of node needs to       |
   | both input and output interfaces | reserve the existing resources   |
   | (nodes A & B in Figure 3).       | and no cross-connection          |
   |                                  | command is needed.               |
   +----------------------------------+----------------------------------+
   | Reusing existing resource only   | This type of node needs to       |
   | on one of the interfaces, either | reserve the resources and send   |
   | input or output interfaces and   | the re-configuration             |
   | using new resource on the        | cross-connection command to its  |
   | other interfaces.                | corresponding data plane         |
   | (nodes C & E in Figure 3).       | node on the interfaces where     |
   |                                  | new resources are needed and     |
   |                                  | it needs to re-use the existing  |
   |                                  | resources on the other           |
   |                                  | interfaces.                      |
   +----------------------------------+----------------------------------+
   | Using new resources on both      | This type of node needs to       |
   | interfaces.                      | reserve the new resources        |
   | (nodes F & G in Figure 3).       | and send the cross-connection    |
   |                                  | command on both interfaces.      |
   +----------------------------------+----------------------------------+

   Table 1: Node Actions During Restoration LSP Setup

The node actions differ depending on whether the resource is reused. This deviates from normal LSP setup, since some nodes do not need to reconfigure the cross-connection. Also, whether the control plane node needs to send a cross-connection setup or modification command to its corresponding data plane node(s) depends on whether the LSPs are sharing resources.

4.3. LSP Reversion

If the end-to-end LSP recovery scheme employs revertive behavior, as described in Section 3 of this document, traffic can be reverted from the restoration LSP to the working or protecting LSP after its failure is repaired. LSP reversion can be achieved using two methods:

1. Make-while-break Reversion, where resources associated with a working or protecting LSP are reconfigured while removing reservations for the restoration LSP.

2. Make-before-break Reversion, where resources associated with a working or protecting LSP are reconfigured before removing reservations for the restoration LSP.

In non-packet transport networks, both of the above reversion methods will result in some traffic disruption when the restoration LSP and the LSP being restored are sharing resources and the cross-connections need to be reconfigured on intermediate nodes.

4.3.1. Make-while-break Reversion

In this reversion method, the head-end simply requests deletion of the restoration LSP. Removing the reservations for the restoration LSP triggers reconfiguration of the resources associated with the working or protecting LSP on every node where resources are shared. The working or protecting LSP state was not removed from the nodes when the failure occurred. Whenever the reservation for the restoration LSP is removed from a node, the data plane configuration changes to reflect the reservations of the working or protecting LSP as signaling progresses. Eventually, after the whole restoration LSP is deleted, the data plane configuration fully matches the working or protecting LSP reservations on the whole path. Thus, reversion is complete.
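The hop-by-hop behavior of make-while-break reversion can be illustrated with a small simulation over the topology of Figure 3. This is a minimal sketch under simplifying assumptions: a reservation is modeled only as an (upstream neighbor, downstream neighbor) pair per node, teardown is assumed to progress from the head-end downstream, and the function names (reservations, make_while_break) are hypothetical. No labels, interfaces, or RSVP messages are modeled.

```python
from typing import Dict, List, Optional, Tuple

# (upstream neighbor, downstream neighbor) as seen by one node
Hop = Tuple[Optional[str], Optional[str]]


def reservations(path: List[str]) -> Dict[str, Hop]:
    """Per-node reservation state for an LSP following the given path."""
    state: Dict[str, Hop] = {}
    for i, node in enumerate(path):
        upstream = path[i - 1] if i > 0 else None
        downstream = path[i + 1] if i + 1 < len(path) else None
        state[node] = (upstream, downstream)
    return state


def make_while_break(working: Dict[str, Hop],
                     restoration: Dict[str, Hop],
                     data_plane: Dict[str, Hop]) -> List[str]:
    """Delete the restoration LSP node by node. Each node that still holds
    working LSP state reverts its cross-connection as the restoration
    reservation is removed. Returns the nodes that were reconfigured."""
    reconfigured: List[str] = []
    for node in list(restoration):         # assumed order: head-end first
        del restoration[node]              # reservation removed at this node
        if node in working:
            if data_plane.get(node) != working[node]:
                data_plane[node] = working[node]
                reconfigured.append(node)
        else:
            data_plane.pop(node, None)     # node no longer carries any LSP
    return reconfigured


working = reservations(["A", "B", "C", "D", "E"])
restoration = reservations(["A", "B", "C", "F", "G", "E"])
# After restoration, nodes on the restoration path are cross-connected for it;
# node D keeps its working LSP state because that LSP was not torn down.
data_plane = {**working, **restoration}

reconfigured = make_while_break(working, restoration, data_plane)
print(reconfigured)            # prints ['C', 'E']
print(data_plane == working)   # prints True
```

In this model only nodes C and E change their cross-connections, which matches the second category of Table 1; nodes A and B keep the same configuration for both LSPs. Note that the sketch has no way to undo a partially completed teardown.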
Make-while-break reversion, while relatively simple in its logic, has the following limitations, which may not be acceptable in some networks:

o No rollback

  If for some reason reconfiguration of the data plane on one of the nodes to match the working or protecting LSP reservations fails, falling back to the restoration LSP is no longer an option, as its state might already have been removed from other nodes.

o No completion guarantee

  Deletion of an LSP provides no guarantee of completion. In particular, if RSVP packets are lost due to a node or link failure, an LSP may be only partially deleted. To mitigate this, RSVP could maintain soft state reservations and hence eventually remove the remaining reservations through refresh timeouts. This approach is not feasible in non-packet transport networks, however, where control and data channels are often separated and hence soft state reservations are not useful.

  Finally, one could argue that graceful LSP deletion [RFC3473] would provide a guarantee of completion. While this is true in most cases, many implementations will time out graceful deletion if the LSP is not removed within a certain amount of time, e.g., due to a transit node fault. After that, deletion procedures that provide no completion guarantee will be attempted. Hence, in corner cases a completion guarantee cannot be provided.

o No explicit notification of completion to head-end node

  In some cases, it may be useful for a head-end node to know when the data plane has been reconfigured to match the working or protecting LSP reservations. This knowledge could be used for initiating operations such as enabling alarm monitoring and power equalization. Unfortunately, for the reasons mentioned above, make-while-break reversion lacks such explicit notification.

4.3.2. Make-before-break Reversion

This reversion method can be used to overcome the limitations of make-while-break reversion. It is similar in spirit to the MBB concept used for re-optimization. Instead of relying on deletion of the restoration LSP, the head-end chooses to establish a new reversion LSP that duplicates the configuration of the resources on the working or protecting LSP and uses identical ASSOCIATION and PROTECTION objects in the Path message of that LSP. Only if setup of this LSP is successful will the other (restoration and working or protecting) LSPs be deleted by the head-end. MBB reversion consists of two parts:

A) Make part:

   Creating a new reversion LSP following the working or protecting LSP's path. The reversion LSP shares all of the resources of the working or protecting LSP and may share resources with the restoration LSP. As the reversion LSP is created, resources are reconfigured to match its reservations. Hence, after the reversion LSP is created, the data plane configuration reflects the working or protecting LSP reservations.

B) Break part:

   After the "make" part is finished, the original working or protecting and restoration LSPs are torn down, and the reversion LSP becomes the new working or protecting LSP. Removing reservations for the working or restoration LSPs does not cause any resource reconfiguration on the reversion LSP's path; nodes follow the same procedures as for the "break" part of any MBB operation. Hence, after the working or protecting and restoration LSPs are removed, the data plane configuration is exactly the same as before starting restoration. Thus, reversion is complete.

MBB reversion uses make-before-break characteristics to overcome the challenges related to make-while-break reversion as follows:

o Rollback

  If the "make" part fails, the (existing) restoration LSP will still be used to carry existing traffic, as the restoration LSP state was not removed.
  The same logic applies here as for any MBB operation failure.

o Completion guarantee

  LSP setup is resilient against RSVP message loss, as Path and Resv messages are refreshed periodically. Hence, given that the network eventually recovers from node and link failures, reversion LSP setup is guaranteed to finish with either success or failure.

o Explicit notification of completion to head-end node

  The head-end knows that the data plane has been reconfigured to match the working or protecting LSP reservations on intermediate nodes when it receives the Resv message for the reversion LSP.

5. Security Considerations

This document reviews procedures defined in [RFC3209], [RFC4872], [RFC4873] and [RFC6689] and does not define any new procedure. This document does not introduce any new security issues other than those already covered in [RFC3209], [RFC4872], [RFC4873] and [RFC6689].

6. IANA Considerations

This informational document does not make any request for IANA action.

7. References

7.1. Normative References

[RFC3209] Awduche, D., Berger, L., Gan, D., Li, T., Srinivasan, V., and G. Swallow, "RSVP-TE: Extensions to RSVP for LSP Tunnels", RFC 3209, December 2001.

[RFC3473] Berger, L., Ed., "Generalized Multi-Protocol Label Switching (GMPLS) Signaling Resource ReserVation Protocol-Traffic Engineering (RSVP-TE) Extensions", RFC 3473, January 2003.

[RFC4872] Lang, J., Ed., Rekhter, Y., Ed., and D. Papadimitriou, Ed., "RSVP-TE Extensions in Support of End-to-End Generalized Multi-Protocol Label Switching (GMPLS) Recovery", RFC 4872, May 2007.

[RFC4873] Berger, L., Bryskin, I., Papadimitriou, D., and A. Farrel, "GMPLS Segment Recovery", RFC 4873, May 2007.

[RFC6689] Berger, L., "Usage of the RSVP ASSOCIATION Object", RFC 6689, July 2012.

7.2. Informative References

[RFC3945] Mannie, E., "Generalized Multi-Protocol Label Switching (GMPLS) Architecture", RFC 3945, October 2004.

[RFC4203] Kompella, K. and Y. Rekhter, "OSPF Extensions in Support of Generalized Multi-Protocol Label Switching (GMPLS)", RFC 4203, October 2005.

[RFC4426] Lang, J., Rajagopalan, B., and D. Papadimitriou, "Generalized Multiprotocol Label Switching (GMPLS) Recovery Functional Specification", RFC 4426, March 2006.

[RFC4427] Mannie, E. and D. Papadimitriou, "Recovery (Protection and Restoration) Terminology for Generalized Multi-Protocol Label Switching", RFC 4427, March 2006.

Acknowledgements

The authors would like to thank George Swallow for the discussions on GMPLS restoration. The authors would like to thank Lou Berger for the guidance on this work. The authors would also like to thank Lou Berger, Vishnu Pavan Beeram and Christian Hopps for reviewing this document and providing valuable comments.

Contributors

Gabriele Maria Galimberti
Cisco Systems, Inc.
EMail: ggalimbe@xxxxxxxxx

Authors' Addresses

Xian Zhang
Huawei Technologies
F3-1-B R&D Center, Huawei Base
Bantian, Longgang District
Shenzhen 518129 P.R.China
EMail: zhang.xian@xxxxxxxxxx

Haomian Zheng (editor)
Huawei Technologies
F3-1-B R&D Center, Huawei Base
Bantian, Longgang District
Shenzhen 518129 P.R.China
EMail: zhenghaomian@xxxxxxxxxx

Rakesh Gandhi (editor)
Cisco Systems, Inc.
EMail: rgandhi@xxxxxxxxx

Zafar Ali
Cisco Systems, Inc.
EMail: zali@xxxxxxxxx

Pawel Brzozowski
ADVA Optical
EMail: PBrzozowski@xxxxxxxxxxxxxxx