In my experience (and I did charter the PIM and MPLS working groups in
ancient history), the distinction between pt2mp for MPLS and IP
Multicast is not about scale. I do not think any of the defining
documents deal with what scale they aim at. IP Multicast includes SSM,
which is pt-2-mpt, and ASM, which is mp-2-mp.
I actually tend to doubt that the BFD error indications will cause data
plane congestion, given the relative rates and expected scales. But it
is our job to say so. Then other folks can decide whether we have
sufficiently addressed the questions.
Yours,
Joel
On 2/24/2024 10:56 PM, loa@xxxxx wrote:
Greg, Joel, all
Traditionally we have distinguished between "p2mp" for MPLS, and
"multicast" for IP. An IP multicast service might easily reach a "large
number of leaves", while MPLS p2mp is more of a "transport" service
where the number of leaves is moderate.
I'm not saying that that "moderate number" might not cause the problems
Greg and Joel discuss, but it might be an idea to think a bit about
the scale. How many leaves are required to cause:
- data plane congestion?
- control plane overload?
Currently I don't see any data plane problems (correct me if I'm wrong),
while control plane overload is a possibility.
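A rough way to put numbers on the scale question is sketched below. It assumes each leaf sends one unauthenticated BFD Control packet (24-byte mandatory section per RFC 5880) in an IPv4/UDP encapsulation at the 1 packet per second rate mentioned in the review; it ignores layer-2 and MPLS overhead, and the function name is hypothetical.

```python
# Back-of-the-envelope load at the root when every leaf reports a failure.
# Assumptions (not from the draft): IPv4, no BFD authentication section,
# no layer-2 or MPLS label overhead counted.
BFD_CONTROL_BYTES = 24   # mandatory section of a BFD Control packet (RFC 5880)
UDP_HEADER_BYTES = 8
IPV4_HEADER_BYTES = 20
PACKET_BYTES = BFD_CONTROL_BYTES + UDP_HEADER_BYTES + IPV4_HEADER_BYTES


def notification_load(num_leaves, packets_per_second=1.0):
    """Return (packets/s, bits/s) converging on the root."""
    pps = num_leaves * packets_per_second
    bits_per_second = pps * PACKET_BYTES * 8
    return pps, bits_per_second


if __name__ == "__main__":
    for leaves in (100, 10_000, 1_000_000):
        pps, bps = notification_load(leaves)
        print(f"{leaves:>9} leaves: {pps:>10.0f} pps, {bps / 1e6:8.2f} Mb/s")
```

Under these assumptions, 10,000 leaves produce about 4.16 Mb/s toward the root, which is small for a data plane, but 10,000 packets per second punted to a control plane may be a different matter.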
/Loa
Mostly. There is one other aspect. You may consider it irrelevant, in
which case we can simply say so. Can the inbound notifications coming
from a large number of leaves at the same time cause data plane
congestion?
Yours,
Joel
On 2/24/2024 8:44 PM, Greg Mirsky wrote:
Hi Joel,
thank you for your quick response. I consider two risks that may
stress the root's control plane:
* notifications transmitted by the leaves reporting the failure of
the p2mp LSP
* notifications transmitted by the root to every leaf closing the
Poll sequence
As I understand it, you refer to the former as inbound congestion. The
latter - outbound. Is that correct? I agree that even the inbound
stream of notifications may overload the root's control plane. And the
outbound process further increases the probability of congestion
in the control plane. My proposal is to apply a rate limiter to
control the inbound flow of BFD Control messages punted to the control
plane.
What would you suggest in addition to the proposed text?
Best regards,
Greg
On Sat, Feb 24, 2024 at 3:28 PM Joel Halpern
<jmh.direct@xxxxxxxxxxxxxxx> wrote:
What you say makes sense. I think we need to acknowledge the
inbound congestion risk, even if we choose not to try to
ameliorate it. Your approach seems to address the outbound
congestion risk from the root.
Yours,
Joel
On 2/24/2024 6:25 PM, Greg Mirsky wrote:
Hi Joel,
thank you for the clarification. My idea is to use a rate limiter
at the root of the p2mp LSP that may receive notifications from
the leaves affected by the failure. I imagine that the threshold
of the rate limiter might be exceeded and the notifications will
be discarded. As a result, some notifications will be processed
by the headend of the p2mp BFD session later, as the tails
transmit notifications periodically until they receive the BFD
Control message with the Final flag set. Thus, we cannot avoid
the congestion, but we can mitigate its negative effect at the
cost of extending the convergence time. Does that make sense?
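The trade-off described above can be sketched with a simple token bucket. This is only an illustration under stated assumptions, not the behavior of any particular implementation: the class and function names are hypothetical, all tails are assumed to retransmit at the same instant once per interval, and a tail is assumed to stop as soon as its notification is admitted (that is, the Final reply arrives before the next interval).

```python
class TokenBucket:
    """Token-bucket limiter: `rate` tokens are added per second, up to `burst`."""

    def __init__(self, rate, burst):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0

    def allow(self, now):
        """Return True if a packet arriving at time `now` may be punted."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over the threshold: the notification is discarded


def rounds_to_converge(num_tails, rate, burst, interval=1.0, max_rounds=1000):
    """Count retransmission rounds until the root has processed every tail."""
    limiter = TokenBucket(rate, burst)
    pending = set(range(num_tails))
    rounds = 0
    while pending and rounds < max_rounds:
        now = rounds * interval
        for tail in sorted(pending):
            if limiter.allow(now):
                pending.discard(tail)  # root answers with Final; tail stops
        rounds += 1
    return rounds


if __name__ == "__main__":
    # 1000 tails, limiter admits 100 packets per second with a burst of 100.
    print(rounds_to_converge(1000, rate=100, burst=100))  # prints 10
```

With these hypothetical numbers the root never processes more than 100 notifications per second, and the last tail is acknowledged in the tenth round rather than the first, which is the extended convergence mentioned above.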
Regards,
Greg
On Sat, Feb 24, 2024 at 2:39 PM Joel Halpern
<jmh@xxxxxxxxxxxxxxx> wrote:
That covers part of my concern. But.... A failure near the
root means that a lot of leaves will see failure, and they
will all send notifications converging on the root. Those
notifications themselves, not just the final messages, seem
able to cause congestion. I am not sure what can be done
about it, but we aren't allowed to ignore it.
Yours,
Joel
On 2/24/2024 3:34 PM, Greg Mirsky wrote:
Hi Joel,
thank you for your support of this work and the suggestion.
Would the following update of the last paragraph of Section
5 help:
OLD TEXT:
  An ingress LSR that has received the BFD Control packet, as described
  above, sends the unicast IP/UDP encapsulated BFD Control packet with
  the Final (F) bit set to the egress LSR.
NEW TEXT:
  As described above, an ingress LSR that has received the BFD Control
  packet sends the unicast IP/UDP encapsulated BFD Control packet with
  the Final (F) bit set to the egress LSR. In some scenarios, e.g.,
  when a p2mp LSP is broken close to its root, and the number of egress
  LSRs is significantly large, the control plane of the ingress LSR
  might be congested by the BFD Control packets transmitted by egress
  LSRs and the process of generating unicast BFD Control packets, as
  noted above. To mitigate that, a BFD implementation that supports
  this specification is RECOMMENDED to use a rate limiter of received
  BFD Control packets passed to processing in the control plane of the
  ingress LSR.
Regards,
Greg
On Thu, Feb 22, 2024 at 4:10 PM Joel Halpern via Datatracker
<noreply@xxxxxxxx> wrote:
Reviewer: Joel Halpern
Review result: Ready
Hello,
I have been selected as the Routing Directorate reviewer for this draft. The
Routing Directorate seeks to review all routing or routing-related drafts as
they pass through IETF last call and IESG review, and sometimes on special
request. The purpose of the review is to provide assistance to the Routing ADs.
For more information about the Routing Directorate, please see
https://wiki.ietf.org/en/group/rtg/RtgDir
Although these comments are primarily for the use of the Routing ADs, it would
be helpful if you could consider them along with any other IETF Last Call
comments that you receive, and strive to resolve them through discussion or by
updating the draft.
Document: draft-name-version
Reviewer: your-name
Review Date: date
IETF LC End Date: date-if-known
Intended Status: copy-from-I-D
Summary: This document is ready for publication as a Proposed Standard.
  I do have one question that I would appreciate being considered.
Comments:
  The document is clear and readable, with careful references for those
  needing additional details.
Major Issues: None
Minor Issues:
  I note that the security considerations (section 6) does refer to
  congestion issues caused by excessive transmission of BFD requests.  I
  wonder if section 5 ("Operation of Multipoint BFD with Active Tail over
  P2MP MPLS LSP") should include a discussion of the congestion implications
  of multiple tails sending notifications at the rate of 1 per second to the
  head end, particularly if the failure is near the head end. While I
  suspect that the 1 / second rate is low enough for this to be safe,
  discussion in the document would be helpful.
--
last-call mailing list
last-call@xxxxxxxx
https://www.ietf.org/mailman/listinfo/last-call