Joel,

> In my experience (and I did charter the PIM and MPLS working groups in
> ancient history), I think you are forgiven by now :).

> the distinction between pt2mp for MPLS and IP
> Multicast is not about scale.

Admittedly, but going by experience with existing networks, I'd say that
there are many more "leaves" in a multicast service network than in an
MPLS p2mp network.

I still think it would be good to know how many MPLS "leaves" it takes
to cause control plane overload, but since BFD control messages might
not be the only factor, a definitive answer might not be possible.

> I do not think any of the defining
> documents deal with what scale they aim at. IP Multicast includes SSM,
> which is pt-2-mpt, and ASM, which is mp-2-mp.
>
> I actually tend to doubt that the BFD error indications will cause data
> plane congestion, given the relative rates and expected scales. But it
> is our job to say so. Then other folks can decide whether we have
> sufficiently addressed the questions.

I don't see that we disagree.

> Yours,
>
> Joel
>
> On 2/24/2024 10:56 PM, loa@xxxxx wrote:
>> Greg, Joel, all
>>
>> Traditionally we have distinguished between "p2mp" for MPLS and
>> "multicast" for IP. An IP multicast service might easily reach a
>> "large number of leaves", while MPLS p2mp is more of a "transport"
>> service where the number of leaves is moderate.
>>
>> I'm not saying that the "moderate number" might not cause the problems
>> Greg and Joel discuss, but it might be an idea to think a bit about
>> the scale. How many leaves are required to cause:
>>
>> - data plane congestion?
>> - control plane overload?
>>
>> Currently I don't see any data plane problems (correct me if I'm
>> wrong), while control plane overload is a possibility.
>>
>> /Loa
>>
>>
>>> Mostly. There is one other aspect. You may consider it irrelevant, in
>>> which case we can simply say so. Can the inbound notifications coming
>>> from a large number of leaves at the same time cause data plane
>>> congestion?
>>>
>>> Yours,
>>>
>>> Joel
>>>
>>> On 2/24/2024 8:44 PM, Greg Mirsky wrote:
>>>> Hi Joel,
>>>> thank you for your quick response. I consider two risks that may
>>>> stress the root's control plane:
>>>>
>>>> * notifications transmitted by the leaves reporting the failure of
>>>>   the p2mp LSP
>>>> * notifications transmitted by the root to every leaf, closing the
>>>>   Poll sequence
>>>>
>>>> As I understand it, you refer to the former as inbound congestion
>>>> and the latter as outbound. Is that correct? I agree that even the
>>>> inbound stream of notifications may overload the root's control
>>>> plane, and the outbound process further increases the probability of
>>>> congestion in the control plane. My proposal is to apply a rate
>>>> limiter to control the inbound flow of BFD Control messages punted
>>>> to the control plane.
>>>> What would you suggest in addition to the proposed text?
>>>>
>>>> Best regards,
>>>> Greg
>>>>
>>>> On Sat, Feb 24, 2024 at 3:28 PM Joel Halpern
>>>> <jmh.direct@xxxxxxxxxxxxxxx> wrote:
>>>>
>>>>   What you say makes sense. I think we need to acknowledge the
>>>>   inbound congestion risk, even if we choose not to try to
>>>>   ameliorate it. Your approach seems to address the outbound
>>>>   congestion risk from the root.
>>>>
>>>>   Yours,
>>>>
>>>>   Joel
>>>>
>>>>   On 2/24/2024 6:25 PM, Greg Mirsky wrote:
>>>>> Hi Joel,
>>>>> thank you for the clarification.
>>>>> My idea is to use a rate limiter
>>>>> at the root of the p2mp LSP that may receive notifications from
>>>>> the leaves affected by the failure. I imagine that the threshold
>>>>> of the rate limiter might be exceeded, in which case the excess
>>>>> notifications will be discarded. As a result, some notifications
>>>>> will be processed by the headend of the p2mp BFD session later,
>>>>> since the tails transmit notifications periodically until they
>>>>> receive the BFD Control message with the Final flag set. Thus, we
>>>>> cannot avoid the congestion, but we can mitigate its negative
>>>>> effect at the cost of extended convergence. Does that make sense?
>>>>>
>>>>> Regards,
>>>>> Greg
>>>>>
>>>>> On Sat, Feb 24, 2024 at 2:39 PM Joel Halpern
>>>>> <jmh@xxxxxxxxxxxxxxx> wrote:
>>>>>
>>>>>   That covers part of my concern. But... a failure near the root
>>>>>   means that a lot of leaves will see the failure, and they will
>>>>>   all send notifications converging on the root. Those
>>>>>   notifications themselves, not just the final messages, seem
>>>>>   able to cause congestion. I am not sure what can be done about
>>>>>   it, but we aren't allowed to ignore it.
>>>>>
>>>>>   Yours,
>>>>>
>>>>>   Joel
>>>>>
>>>>>   On 2/24/2024 3:34 PM, Greg Mirsky wrote:
>>>>>> Hi Joel,
>>>>>> thank you for your support of this work and the suggestion.
>>>>>> Would the following update of the last paragraph of Section 5
>>>>>> help:
>>>>>>
>>>>>> OLD TEXT:
>>>>>>    An ingress LSR that has received the BFD Control packet, as
>>>>>>    described above, sends the unicast IP/UDP encapsulated BFD
>>>>>>    Control packet with the Final (F) bit set to the egress LSR.
>>>>>> NEW TEXT:
>>>>>>    As described above, an ingress LSR that has received the BFD
>>>>>>    Control packet sends the unicast IP/UDP encapsulated BFD
>>>>>>    Control packet with the Final (F) bit set to the egress LSR.
>>>>>>    In some scenarios, e.g., when a p2mp LSP is broken close to
>>>>>>    its root and the number of egress LSRs is significantly
>>>>>>    large, the control plane of the ingress LSR might be
>>>>>>    congested by the BFD Control packets transmitted by egress
>>>>>>    LSRs and by the process of generating unicast BFD Control
>>>>>>    packets, as noted above. To mitigate that, a BFD
>>>>>>    implementation that supports this specification is
>>>>>>    RECOMMENDED to use a rate limiter on received BFD Control
>>>>>>    packets passed for processing to the control plane of the
>>>>>>    ingress LSR.
>>>>>>
>>>>>> Regards,
>>>>>> Greg
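To make the proposal above concrete, here is a minimal token-bucket
sketch of such a rate limiter for punted BFD Control packets. All names,
rates, and thresholds are illustrative assumptions, not text from the
draft; the point is only that a refused packet is delayed, not lost,
because the tail retransmits until it sees the Final flag.

    package main

    import (
        "fmt"
        "time"
    )

    // bucket admits at most `rate` packets per second to the control
    // plane, allowing bursts of up to `burst` packets.
    type bucket struct {
        tokens float64
        burst  float64
        rate   float64   // refill rate, tokens per second
        last   time.Time // time of the previous admit() call
    }

    func newBucket(rate, burst float64, now time.Time) *bucket {
        return &bucket{tokens: burst, burst: burst, rate: rate, last: now}
    }

    // admit reports whether one punted BFD Control packet may be passed
    // to the control plane. A refused packet is simply dropped: the tail
    // keeps notifying until it receives the Final, so the notification
    // is deferred to a later retransmission rather than lost.
    func (b *bucket) admit(now time.Time) bool {
        b.tokens += now.Sub(b.last).Seconds() * b.rate
        if b.tokens > b.burst {
            b.tokens = b.burst
        }
        b.last = now
        if b.tokens >= 1 {
            b.tokens--
            return true
        }
        return false
    }

    func main() {
        start := time.Now()
        lim := newBucket(500, 100, start) // hypothetical: 500 punts/s, burst 100

        // Hypothetical worst case: 10,000 tails all notify within ~100 ms
        // of a failure near the root.
        admitted, deferred := 0, 0
        for i := 0; i < 10000; i++ {
            if lim.admit(start.Add(time.Duration(i*10) * time.Microsecond)) {
                admitted++
            } else {
                deferred++
            }
        }
        fmt.Printf("admitted %d now, deferred %d to later retransmissions\n",
            admitted, deferred)
    }

With these made-up numbers, only a few hundred notifications reach the
control plane immediately; the rest are absorbed by the tails' periodic
retransmissions, which is exactly the convergence-for-stability trade-off
described above.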
>>>>>> For more information about the Routing Directorate, >>>>>> please see >>>>>> https://wiki.ietf.org/en/group/rtg/RtgDir >>>>>> >>>>>> Although these comments are primarily for the use of >>>>>> the >>>>>> Routing ADs, it would >>>>>> be helpful if you could consider them along with any >>>>>> other IETF Last Call >>>>>> comments that you receive, and strive to resolve them >>>>>> through discussion or by >>>>>> updating the draft. >>>>>> >>>>>> Document: draft-name-version >>>>>> Reviewer: your-name >>>>>> Review Date: date >>>>>> IETF LC End Date: date-if-known >>>>>> Intended Status: copy-from-I-D >>>>>> >>>>>> Summary:� This document is ready for publication as >>>>>> a >>>>>> Proposed Standard. >>>>>> � � I do have one question that I would >>>>>> appreciate being >>>>>> considered. >>>>>> >>>>>> Comments: >>>>>> � � The document is clear and readable, with >>>>>> careful >>>>>> references for those >>>>>> � � needing additional details. >>>>>> >>>>>> Major Issues: None >>>>>> >>>>>> Minor Issues: >>>>>> � � I note that the security considerations >>>>>> (section 6) >>>>>> does refer to >>>>>> � � congestion issues caused by excessive >>>>>> transmission >>>>>> of BFD requests.� � I >>>>>> � � wonder if section 5 ("Operation of Multipoint >>>>>> BFD >>>>>> with Active Tail over >>>>>> � � P2MP MPLS LSP") should include a discussion >>>>>> of the >>>>>> congestion implications >>>>>> � � of multiple tails sending notifications at >>>>>> the rate >>>>>> of 1 per second to the >>>>>> � � head end, particularly if the failure is near >>>>>> the >>>>>> head end.� While I >>>>>> � � suspect that the 1 / second rate is low >>>>>> enough for >>>>>> this to be safe, >>>>>> � � discussion in the document would be helpful. >>>>>> >>>>>> >> > > -- > last-call mailing list > last-call@xxxxxxxx > https://www.ietf.org/mailman/listinfo/last-call > -- last-call mailing list last-call@xxxxxxxx https://www.ietf.org/mailman/listinfo/last-call