The -11 version looks good to me. Over to Mirja ...

Thanks, --David

> -----Original Message-----
> From: Stig Venaas [mailto:stig@xxxxxxxxxx]
> Sent: Friday, January 26, 2018 1:04 PM
> To: Black, David <david.black@xxxxxxx>
> Cc: draft-ietf-pim-source-discovery-bsr.all@xxxxxxxx; Stewart Bryant
> <stewart.bryant@xxxxxxxxx>; ietf@xxxxxxxx; pim@xxxxxxxx; tsv-art@xxxxxxxx
> Subject: Re: [Tsv-art] Tsvart telechat review of
> draft-ietf-pim-source-discovery-bsr-08
>
> Oops, you're right. I've made these changes and posted revision 11
> now, so hopefully it is ready for publication. That seems to be the
> only discuss.
>
> Thanks,
> Stig
>
>
> On Fri, Jan 26, 2018 at 8:07 AM, Black, David <David.Black@xxxxxxxx> wrote:
> > Hi Stig,
> >
> > This is looking good - the technical issue is resolved, as I agree with
> > the approach in -10, thanks!
> >
> > There are a couple of editorial items that need attention:
> >
> > [1] New text in Section 3.3:
> >
> >    A router MUST NOT originate more than N messages per minute. This
> >    document does not mandate how this should be implemented, but some
> >    possible ways could be having a minimal time between each message,
> >    counting the number of messages originated and resetting the count
> >    every minute, or using a leaky bucket algorithm. One benefit of
> >    using a leaky bucket algorithm is that it can handle bursts better.
> >    The default value of N is 6. The value MUST be configurable.
> >    Depending on the network one may want to use a low value allowing new
> >    information to be propagated, but with a large number of routers and
> >    many updates, the total number of messages might become too large and
> >    require too much processing.
> >
> > "Depending on the network one may want to use a low value allowing new
> > information to be propagated,"
> >
> > That seems wrong, as a low value of N would hit the messages per minute
> > limit sooner.
> > Would "low" -> "larger" correctly capture the intent?
> > If so:
> >
> > OLD
> >    Depending on the network one may want to use a low value allowing new
> >    information to be propagated, but with a large number of routers and
> >    many updates, the total number of messages might become too large and
> >    require too much processing.
> > NEW
> >    Depending on the network, one may want to use a larger value of N to
> >    favor propagation of new information, but with a large number of
> >    routers and many updates, the total number of messages might become
> >    too large and require too much processing.
> >
> > [2] The first paragraph in Section 4.2 specifies the time periods for GSH
> > TLVs; text ought to be added there that refers to the new message timing
> > requirements in Section 3.3 (text quoted in [1] above) to ensure that GSH
> > implementers clearly understand that those message timing requirements
> > apply to GSH. One can infer this applicability from the structure of the
> > document, but I would prefer to directly tell GSH implementers that this
> > is required.
> >
> > Many thanks for the productive discussion. Also, Mirja deserves the
> > initial credit for asking that a closer look be taken at the flooding
> > mechanism.
> >
> > Thanks, --David
> >
> >
> >> -----Original Message-----
> >> From: Stig Venaas [mailto:stig@xxxxxxxxxx]
> >> Sent: Thursday, January 25, 2018 6:31 PM
> >> To: Black, David <david.black@xxxxxxx>
> >> Cc: draft-ietf-pim-source-discovery-bsr.all@xxxxxxxx; Stewart Bryant
> >> <stewart.bryant@xxxxxxxxx>; ietf@xxxxxxxx; pim@xxxxxxxx; tsv-art@xxxxxxxx
> >> Subject: Re: [Tsv-art] Tsvart telechat review of
> >> draft-ietf-pim-source-discovery-bsr-08
> >>
> >> Hi
> >>
> >> I just posted version 10 which I think should resolve the issues
> >> raised in the tsv-art review and the discuss that was raised. The
> >> change is mainly to limit how often messages can be originated. It
> >> specifies a default of max 6 messages per 60 seconds and 1 second
> >> between each message.
> >> It also says that the limits must be configurable. Note that I first
> >> posted version 9, noticed one small issue and then posted version 10.
> >>
> >> It's embarrassing that we completely forgot to put such limits in the
> >> draft, and I'm grateful for the review allowing us to fix it before
> >> publication.
> >>
> >> Stig
> >>
> >>
> >> On Wed, Jan 24, 2018 at 12:08 PM, Black, David <David.Black@xxxxxxxx>
> >> wrote:
> >> > One change - the value MUST be configurable. While 6 is a plausible
> >> > number, it results from our intelligent speculation. If that number is
> >> > wrong and causes damage in a frail network, that number has to be
> >> > changeable as part of the experiment. The Proposed Standard successor
> >> > to this forthcoming Experimental RFC would be an appropriate context
> >> > for a MUST vs. SHOULD discussion, IMHO.
> >> >
> >> > I also would specify a minimum time between packets, which also needs
> >> > to be configurable. That time doesn't have to be the 10 second value
> >> > from RFC 5059, as this draft is doing something different, but a value
> >> > is needed to prevent sending 6 packets back-to-back to a router that
> >> > can currently handle the first 1 or 2 but will drop the rest because
> >> > of everything else in the chaos that it's currently dealing with.
> >> >
> >> > Thanks, --David
> >> >
> >> >
> >> >> -----Original Message-----
> >> >> From: Tsv-art [mailto:tsv-art-bounces@xxxxxxxx] On Behalf Of Stig Venaas
> >> >> Sent: Wednesday, January 24, 2018 1:33 PM
> >> >> To: Black, David <david.black@xxxxxxx>
> >> >> Cc: draft-ietf-pim-source-discovery-bsr.all@xxxxxxxx; Stewart Bryant
> >> >> <stewart.bryant@xxxxxxxxx>; ietf@xxxxxxxx; pim@xxxxxxxx; tsv-art@xxxxxxxx
> >> >> Subject: Re: [Tsv-art] Tsvart telechat review of
> >> >> draft-ietf-pim-source-discovery-bsr-08
> >> >>
> >> >> Hi
> >> >>
> >> >> I agree keeping it simple is good, but I have some concerns about
> >> >> requiring a minimal fixed time like 10 seconds in BSR (RFC 5059)
> >> >> between each message. I would prefer something like:
> >> >>
> >> >> A router MUST NOT originate more than N packets per minute; note that
> >> >> this does not consider packets that are being forwarded by the router.
> >> >> This document does not mandate how this should be implemented, but
> >> >> some possible ways could be having a minimal time between each packet,
> >> >> counting the number of packets originated and resetting the count
> >> >> every minute, or using a leaky bucket algorithm. One benefit of using
> >> >> a leaky bucket algorithm is that it can handle bursts better. The
> >> >> default value of N is 6. The value SHOULD be configurable. Depending
> >> >> on the network one may want to use a low value allowing new
> >> >> information to be propagated, but with a large number of routers and
> >> >> many updates, the total number of messages might become too large and
> >> >> require too much processing. The PFM mechanism can be used to
> >> >> distribute many different types of information. When defining new
> >> >> types, it should be considered what changes, if any, warrant sending
> >> >> a triggered message.
> >> >>
> >> >> For the GSH (source announcement) TLV, I'll make it clear that a
> >> >> triggered message is useful when a new source is detected, but one
> >> >> should not trigger a message due to a source expiring (becoming
> >> >> inactive).
> >> >>
> >> >> Thoughts?
> >> >>
> >> >> Stig
> >> >>
> >> >>
> >> >> On Wed, Jan 24, 2018 at 9:40 AM, Black, David <David.Black@xxxxxxxx>
> >> >> wrote:
> >> >> > That works for me, Thanks, --David
> >> >> >
> >> >> >
> >> >> >> -----Original Message-----
> >> >> >> From: Stewart Bryant [mailto:stewart.bryant@xxxxxxxxx]
> >> >> >> Sent: Wednesday, January 24, 2018 11:45 AM
> >> >> >> To: Black, David <david.black@xxxxxxx>; Stig Venaas <stig@xxxxxxxxxx>
> >> >> >> Cc: tsv-art@xxxxxxxx; ietf@xxxxxxxx; pim@xxxxxxxx;
> >> >> >> draft-ietf-pim-source-discovery-bsr.all@xxxxxxxx
> >> >> >> Subject: Re: Tsvart telechat review of
> >> >> >> draft-ietf-pim-source-discovery-bsr-08
> >> >> >>
> >> >> >> The problem with complex processing under error conditions is that
> >> >> >> that is where all the software bugs hang out because they are hard
> >> >> >> to test and don't show up until you have the problem they are
> >> >> >> trying to fix.
> >> >> >>
> >> >> >> This is a case where you want the simplest possible process like a
> >> >> >> small burst followed by your 60s interval which seems unlikely to
> >> >> >> stress any sensibly designed implementation on a reasonably sized
> >> >> >> network.
> >> >> >>
> >> >> >> - Stewart
> >> >> >>
> >> >> >>
> >> >> >> On 24/01/2018 16:30, Black, David wrote:
> >> >> >> > Hi Stig,
> >> >> >> >
> >> >> >> >> I agree with all you wrote and will update the document. However,
> >> >> >> >> there is one slight issue with the minimum time between
> >> >> >> >> origination of each message. When a new source is detected, we
> >> >> >> >> would like to originate a message ASAP so that receivers can
> >> >> >> >> start receiving the multicast without much delay.
> >> >> >> >> A 10s delay would be a rather long time if a source was detected
> >> >> >> >> right after the previous message was originated. I think some
> >> >> >> >> delay would be warranted though, in particular in a case where
> >> >> >> >> perhaps a router starts up and a large number of directly
> >> >> >> >> connected sources could be detected within a short time frame.
> >> >> >> >> I think an exponential back-off could make sense here. E.g., if
> >> >> >> >> it is just one new source, maybe trigger a message ASAP. If a
> >> >> >> >> new source is detected right after the previous one, wait a bit
> >> >> >> >> longer, which also allows for aggregation of multiple sources in
> >> >> >> >> one message if several are detected later. In extreme cases one
> >> >> >> >> could over time keep increasing the delay until the next update.
> >> >> >> >> If sufficient we could maybe have a fixed minimum delay of 1s or
> >> >> >> >> not, but that is probably too short in those extreme cases.
> >> >> >> >> Hence maybe an exponential back-off.
> >> >> >> > Exponential back-off sounds like a very good idea - I'd suggest
> >> >> >> > adding something starting from RFC 5059's back-off functionality.
> >> >> >> >
> >> >> >> >> I would appreciate some further guidance on what you think is
> >> >> >> >> reasonable here, and perhaps whether I can borrow something here
> >> >> >> >> from other protocols/drafts. Part of the experiment here might be
> >> >> >> >> to find out what minimum values, or how rapid back-off, is needed
> >> >> >> >> based on the size of the network, the amount of sources, the
> >> >> >> >> types of links etc.
> >> >> >> > In addition to burst scenarios (e.g., router starts up, lots of new
> >> >> >> > sources detected quickly as a result), I strongly suggest thinking
> >> >> >> > about chaos scenarios where links and/or routers are coming and
> >> >> >> > going so rapidly that the source population is in a constant state
> >> >> >> > of flux. If things are really bad, the best thing to do may be to
> >> >> >> > shut up and hope that the chaos settles out, as not much useful
> >> >> >> > will happen until it does, and sending messages about observed
> >> >> >> > changes risks making things worse. Again, exponential back-off
> >> >> >> > makes sense, possibly quite aggressive, e.g., back-off from 10
> >> >> >> > seconds by a small factor a few times, and if things still look
> >> >> >> > bad, wait at least a minute or two with further back-off from that
> >> >> >> > longer time until things stabilize. This needs more thought on how
> >> >> >> > to adjust the back-off factor, as that off-the-top-of-my-head
> >> >> >> > example probably exhibits peculiar behavior in scenarios that just
> >> >> >> > are on the edge of tripping the long delay - some thinking about
> >> >> >> > what stability means and how to get there may help in figuring out
> >> >> >> > the relative merits and applicability of backing off further vs.
> >> >> >> > some kind of dramatic reset, analogous to TCP's congestion window
> >> >> >> > reset on timeout.
> >> >> >> >
> >> >> >> > As this is intended to be an experimental RFC, I don't think a
> >> >> >> > completely worked-out solution is expected or required - a good
> >> >> >> > discussion of the problems and explanation of areas that need
> >> >> >> > investigation as part of the experiment ought to suffice, as
> >> >> >> > suggested in the last sentence quoted above. I would add some
> >> >> >> > initial exponential back-off functionality as a starting point.
> >> >> >> >
> >> >> >> >> Also note that the general mechanism can be used for many types
> >> >> >> >> of information. It depends on the information how urgent it is
> >> >> >> >> to distribute it. Source discovery in particular is fairly urgent.
> >> >> >> > And that should be discussed, perhaps in Section 3 somewhere.
> >> >> >> >
> >> >> >> > Thanks, --David
> >> >> >> >
> >> >> >> >
> >> >> >> >> -----Original Message-----
> >> >> >> >> From: Stig Venaas [mailto:stig@xxxxxxxxxx]
> >> >> >> >> Sent: Tuesday, January 23, 2018 7:44 PM
> >> >> >> >> To: Black, David <david.black@xxxxxxx>
> >> >> >> >> Cc: tsv-art@xxxxxxxx; draft-ietf-pim-source-discovery-bsr.all@xxxxxxxx;
> >> >> >> >> ietf@xxxxxxxx; pim@xxxxxxxx
> >> >> >> >> Subject: Re: Tsvart telechat review of
> >> >> >> >> draft-ietf-pim-source-discovery-bsr-08
> >> >> >> >>
> >> >> >> >> Hi, thanks for the great comments.
> >> >> >> >>
> >> >> >> >> I agree with all you wrote and will update the document. However,
> >> >> >> >> there is one slight issue with the minimum time between
> >> >> >> >> origination of each message. When a new source is detected, we
> >> >> >> >> would like to originate a message ASAP so that receivers can
> >> >> >> >> start receiving the multicast without much delay. A 10s delay
> >> >> >> >> would be a rather long time if a source was detected right after
> >> >> >> >> the previous message was originated. I think some delay would be
> >> >> >> >> warranted though, in particular in a case where perhaps a router
> >> >> >> >> starts up and a large number of directly connected sources could
> >> >> >> >> be detected within a short time frame. I think an exponential
> >> >> >> >> back-off could make sense here.
> >> >> >> >> E.g., if it is just one new source, maybe trigger a message ASAP.
> >> >> >> >> If a new source is detected right after the previous one, wait a
> >> >> >> >> bit longer, which also allows for aggregation of multiple sources
> >> >> >> >> in one message if several are detected later. In extreme cases
> >> >> >> >> one could over time keep increasing the delay until the next
> >> >> >> >> update. If sufficient we could maybe have a fixed minimum delay
> >> >> >> >> of 1s or not, but that is probably too short in those extreme
> >> >> >> >> cases. Hence maybe an exponential back-off.
> >> >> >> >>
> >> >> >> >> I would appreciate some further guidance on what you think is
> >> >> >> >> reasonable here, and perhaps whether I can borrow something here
> >> >> >> >> from other protocols/drafts. Part of the experiment here might be
> >> >> >> >> to find out what minimum values, or how rapid back-off, is needed
> >> >> >> >> based on the size of the network, the amount of sources, the
> >> >> >> >> types of links etc.
> >> >> >> >>
> >> >> >> >> Also note that the general mechanism can be used for many types
> >> >> >> >> of information. It depends on the information how urgent it is
> >> >> >> >> to distribute it. Source discovery in particular is fairly urgent.
> >> >> >> >>
> >> >> >> >> Stig
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> On Tue, Jan 23, 2018 at 3:40 PM, David Black <david.black@xxxxxxxx>
> >> >> >> >> wrote:
> >> >> >> >>> Reviewer: David Black
> >> >> >> >>> Review result: Ready with Issues
> >> >> >> >>>
> >> >> >> >>> I've reviewed this document as part of TSV-ART's ongoing effort
> >> >> >> >>> to review key IETF documents. These comments were written
> >> >> >> >>> primarily for the transport area directors, but are copied to
> >> >> >> >>> the document's authors for their information and to allow them
> >> >> >> >>> to address any issues raised.
> >> >> >> >>> When done at the time of IETF Last Call, the authors should
> >> >> >> >>> consider this review together with any other last-call comments
> >> >> >> >>> they receive. Please always CC tsv-art@xxxxxxxx if you reply to
> >> >> >> >>> or forward this review.
> >> >> >> >>>
> >> >> >> >>> This draft describes an experimental PFM (PIM Flooding
> >> >> >> >>> Mechanism) mechanism for flooding PIM information among
> >> >> >> >>> multicast routers that is a generalized form of the RFC 5059
> >> >> >> >>> PIM BSR (BootStrap Router) mechanism, and applies this
> >> >> >> >>> mechanism to distribution of source group mappings (PFM-SD).
> >> >> >> >>>
> >> >> >> >>> Early implementation experience with PFM-SD on low bandwidth
> >> >> >> >>> radio links (described in Section 2) suggests that the
> >> >> >> >>> mechanism is able to work better than PIM-SM without starving
> >> >> >> >>> other traffic in the fashion that PIM-DM may. This is promising
> >> >> >> >>> and (in this reviewer's opinion) justifies experimentation at
> >> >> >> >>> larger scale and in other network environments. In general,
> >> >> >> >>> this is a well-written document and the authors should be
> >> >> >> >>> commended for including the "running code" implementation
> >> >> >> >>> experience report in Section 2.
> >> >> >> >>>
> >> >> >> >>> Flooding mechanisms are very useful, but the time periods that
> >> >> >> >>> govern sending of flooding messages are crucial to avoid
> >> >> >> >>> excessive consumption of network resources. Section 5 of RFC
> >> >> >> >>> 5059 has a solid discussion of the time periods that apply to
> >> >> >> >>> use of flooding by the BSR mechanism.
> >> >> >> >>> The discussion in this draft is somewhat weaker, raising a
> >> >> >> >>> couple of minor issues:
> >> >> >> >>>
> >> >> >> >>> 1) For PFM-SD, Section 4.2 provides a reasonable discussion of
> >> >> >> >>> time periods that apply, but appears to be missing a minimum
> >> >> >> >>> time period between sending messages. Section 5 of RFC 5059
> >> >> >> >>> recommends a default of 10 seconds for that minimum time period
> >> >> >> >>> by comparison to a default PIM BSR sending interval of 60
> >> >> >> >>> seconds. That 10 second minimum default should be added to this
> >> >> >> >>> draft, as the same default sending interval of 60 seconds is
> >> >> >> >>> used.
> >> >> >> >>>
> >> >> >> >>> 2) For future use of PFM for other purposes, Section 3.3
> >> >> >> >>> provides the following guidance:
> >> >> >> >>>
> >> >> >> >>>    Each TLV definition will need to define when a triggered PFM
> >> >> >> >>>    message needs to be originated, and also whether to send
> >> >> >> >>>    periodic messages, and how frequent.
> >> >> >> >>>
> >> >> >> >>> That guidance is correct as far as it goes, but it's not
> >> >> >> >>> particularly helpful to future protocol designers. Text should
> >> >> >> >>> be added to at least point to the examples in section 4.2 of
> >> >> >> >>> this draft and/or part of Section 5 of RFC 5059 to suggest the
> >> >> >> >>> sorts of values that have proven to be workable, and perhaps
> >> >> >> >>> also strongly encourage (SHOULD use) a default minimum time
> >> >> >> >>> between messages of at least 10 seconds.
> >> >> >> >>>
> >> >> >> >>> Understanding this draft requires that the reader be familiar
> >> >> >> >>> with multicast and PIM, which is reasonable.
> >> >> >> >>> In addition, an understanding of PIM BSR is also required,
> >> >> >> >>> which is perhaps somewhat less reasonable. An example that this
> >> >> >> >>> reviewer tripped over is that Section 3 of this draft states
> >> >> >> >>> that "Like BSR, messages are forwarded hop by hop." There is no
> >> >> >> >>> further explanation or definition of "forwarded hop by hop,"
> >> >> >> >>> making it necessary to consult RFC 5059 to understand that
> >> >> >> >>> term, e.g., this has nothing to do with IPv6 hop-by-hop
> >> >> >> >>> options. A sentence or two of explanation of this hop by hop
> >> >> >> >>> forwarding concept ought to be copied and adapted from RFC
> >> >> >> >>> 5059, and it would be good to check for other concepts that
> >> >> >> >>> rely on RFC 5059 for definitions.
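
For readers following the Section 3.3 discussion in this thread, here is a
minimal Python sketch of one way a router might enforce the origination
limit that was settled on: at most N messages per minute (default 6) plus a
minimum spacing between originated messages (default 1 second), both
configurable. The draft explicitly does not mandate an implementation, so
everything here is an assumption for illustration: the class name
OriginationLimiter, its method names and its parameters are hypothetical,
and the approach shown is a sliding window of timestamps rather than the
leaky bucket the draft mentions as one option. It covers only locally
originated messages, not forwarded ones, and does not attempt the
exponential back-off discussed earlier in the thread.

```python
from collections import deque


class OriginationLimiter:
    """Hypothetical limiter for locally originated messages.

    Allows at most `max_per_window` originations in any `window` seconds and
    requires at least `min_gap` seconds between consecutive originations.
    Defaults mirror the values quoted in the thread (6 per 60 s, 1 s apart).
    """

    def __init__(self, max_per_window=6, window=60.0, min_gap=1.0):
        if max_per_window < 1 or window <= 0 or min_gap < 0:
            raise ValueError("invalid limiter configuration")
        self.max_per_window = max_per_window
        self.window = window
        self.min_gap = min_gap
        self._sent = deque()  # timestamps of originations still inside the window

    def earliest_send_time(self, now):
        """Return the earliest time >= now at which a message may be originated."""
        # Drop originations that have aged out of the window.
        while self._sent and now - self._sent[0] >= self.window:
            self._sent.popleft()
        allowed_at = now
        if self._sent:
            # Enforce the minimum spacing after the most recent origination.
            allowed_at = max(allowed_at, self._sent[-1] + self.min_gap)
        if len(self._sent) >= self.max_per_window:
            # Window is full: wait until the oldest origination expires.
            allowed_at = max(allowed_at, self._sent[0] + self.window)
        return allowed_at

    def try_originate(self, now):
        """Record an origination at `now` if permitted; return True if it was."""
        if self.earliest_send_time(now) > now:
            return False
        self._sent.append(now)
        return True


if __name__ == "__main__":
    limiter = OriginationLimiter()
    # Timestamps are in seconds; a real router would use a monotonic clock.
    for t in (0.0, 0.5, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 60.0):
        print(t, limiter.try_originate(t))
```

With the defaults, a second message 0.5 s after the first is held back by
the minimum spacing, and a seventh message within the same minute is held
back until the oldest origination is 60 s old. A caller that wants to
aggregate several newly detected sources into one message could use
earliest_send_time() to schedule a single deferred send instead of dropping
the trigger.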