Hi Robert,

Thanks again for the prompt responses.

>Well, those are just a subset of the things that could change in command's
>context that would cause the command to be erroneous or even damaging if
>it were run, and you're not addressing the other security issues that come
>with very long scheduling (overflowing buffers, or having lots of time to
>schedule a massive number of commands to all try to happen at once). I
>suspect there are other things that pressured adding the "near future"
>restriction that haven't been captured well yet.

Well, the thing is that 15 seconds (or 'a few seconds' for that matter) is a long enough time to send thousands (or more) of scheduled RPCs, so I am not sure that sched-max-future mitigates the buffer overflow threat. Generally speaking, Section 3.6 discusses erroneous scenarios, and not security threats. I would suggest adding some text to the Security Considerations section that discusses the overflow attack you mentioned here. Would this address your concern?

>I think you're saying that in production deployments today, the
>authorization policy is "the peer was able to send me a packet". Is that
>wrong?

I can't comment about what is deployed in production today, although I am sure there are operators out there who can comment about that. RFC 6536, which defines a NETCONF access control model, is cited by 6 other RFCs, so I do not think access control has been overlooked by the community. Nevertheless, I believe that (much like RFC 6241) the access control specifics are not within the scope of the current draft.

Thanks,
Tal.

>-----Original Message-----
>From: Robert Sparks [mailto:rjsparks@xxxxxxxxxxx]
>Sent: Tuesday, August 04, 2015 9:01 PM
>To: Tal Mizrahi
>Cc: ietf@xxxxxxxx; General Area Review Team; draft-mm-netconf-time-capability.all@xxxxxxxx
>Subject: Re: Gen-art LC review: draft-mm-netconf-time-capability-05
>
>
>
>On 8/4/15 11:19 AM, Tal Mizrahi wrote:
>> Hi Robert,
>>
>> Thanks for the comments.
>>
>>
>>>> A typical example of using near-future scheduling is a coordinated
>>>> commit; a client needs to trigger a commit at n servers, so that the
>>>> n servers perform the commit as close as possible to simultaneously.
>>>> Without the time capability, the client sends a sequence of n commit
>>>> messages, and thus each server performs the commit at a different
>>>> time. By using the time capability, the client can send commit
>>>> messages that are scheduled to take place at time Ts, which is 5
>>>> seconds in the future, causing the servers to invoke the commit as close
>>>> as possible to time Ts.
>>> I'm interested in your response to Andy's point on this paragraph.
>> Okay, so here is Andy's point:
>>
>>>> You should pick a different example because the NETCONF
>>>> confirmed-commit procedure is designed to be loose-coupled. The
>>>> default timeout is 10 minutes.
>>>> Since the client needs sessions open with all servers involved in
>>>> the network-wide commit, there is no advantage in staging the
>>>> <commit> operations 15 sec. in advance, to make sure the servers are
>>>> reachable.
>> And here is our response from 02-Aug-2015:
>>
>>> Right, confirmed-commit is loose-coupled. But the example quoted
>>> above (Example 1 in the draft) is not intended to replace the
>>> confirmed commit. The purpose in this example is different: the
>>> client wants the commit RPCs to be executed at the same time in
>>> all servers.
>>> The confirmed-commit serves a different purpose, which is to make
>>> sure that everyone either commits or rolls back. BTW, a confirmed
>>> commit can be sent with the scheduled-time element, allowing to enjoy
>>> the best of both worlds.
>>
>> Please let us know if you have further concerns about this point.
>>
>>
>>>> The default value of sched-max-future is defined to be 15 seconds.
>>>> This duration is long enough to allow the scheduled RPC to be sent
>>>> by the client, potentially to multiple servers, and in some cases to
>>>> send a cancellation message, as described in Section 3.2. On the
>>>> other hand, the 15 second duration yields a very low probability of a
>>>> reboot or a permission change.
>>> I'm not finding the explanation terribly persuasive, but it's at
>>> least _some_ explanation - thanks for that. I'll leave it to the ADs
>>> and other reviewers in the field to see if it's sufficient for an
>>> experimental protocol.
>> (*) Please see comment (**) below.
>>
>>>> Note that we did not define a maximal value for sched-max-future,
>>>> since one of the goals was to define a generic tool that can be used
>>>> for various different environments. The draft clearly states the
>>>> intention of using near-future-scheduling, but the requirements and
>>>> constraints of different environments may require the
>>>> sched-max-future to have a different value, potentially higher than
>>>> 30 seconds. Hence, we prefer not to define a maximal value. Indeed, in
>>>> the draft 06 there is a more detailed discussion about the issues we
>>>> are trying to prevent by using near-future scheduling (Section 3.6).
>>> Without a maximal value, I think you need more of a discussion
>>> guiding the choice of sched-max-future. Otherwise, you are just
>>> waving your hands at not addressing the problems with far-future
>>> scheduling, and potentially well-meaning but uninformed people are
>>> going to go step in them anyway. There was a point to choosing the
>>> near-future limit.
>>> Enforce it or explain it with more vigor please.
>> (**) Your point is well taken. What we suggest, regarding this point and the
>> previous point (*), is that we add more text explaining the factors that affect
>> sched-max-future to Section 3.6.
>>
>> Here is the new text we suggest.
>> Please let us know if this addresses your comment:
>>
>>
>> The challenge in far future scheduling is that during the long period between
>> the time at which the RPC is sent and the time at which it is scheduled to be
>> executed the following erroneous events may occur:
>> - The server may restart.
>> - The client's authorization level may be changed.
>> - The client may restart and send a conflicting RPC.
>> - A different client may send a conflicting RPC.
>Well, those are just a subset of the things that could change in command's
>context that would cause the command to be erroneous or even damaging if
>it were run, and you're not addressing the other security issues that come
>with very long scheduling (overflowing buffers, or having lots of time to
>schedule a massive number of commands to all try to happen at once). I
>suspect there are other things that pressured adding the "near future"
>restriction that haven't been captured well yet.
>>
>> In these cases if the server performs the scheduled operation it may
>> perform an action that is inconsistent with the current network policy, or
>> inconsistent with the currently active clients.
>>
>> Near future scheduling guarantees that external events such as the
>> examples above have a low probability of occurring during the
>> sched-max-future period, and even when they do, the period of inconsistency
>> is limited to sched-max-future, which is a short period of time.
>>
>> Hence, sched-max-future should be configured to a value that is high
>> enough to allow the client to:
>> 1. Send the scheduled RPC, potentially to multiple servers.
>> 2. Receive notifications or rpc-error messages from the server(s), or wait for
>> a timeout and decide that if no response has arrived then something is wrong.
>> 3. If necessary, send a cancellation message, potentially to multiple servers.
>>
>> On the other hand, sched-max-future should be configured to a value that is
>> low enough to allow a low probability of the erroneous events above, typically
>> on the order of a few seconds. Note that even if sched-max-future is
>> configured to a low value, it is still possible (with a low probability) that an
>> erroneous event will occur. However, this short potentially hazardous period
>> is not significantly worse than in conventional (unscheduled) RPCs, as even a
>> conventional RPC may in some cases be executed a few seconds after it was
>> sent by the client.
>>
>> The default value of sched-max-future is defined to be 15 seconds. This
>> duration is long enough to allow the scheduled RPC to be sent by the client,
>> potentially to multiple servers, and in some cases to send a cancellation
>> message, as described in Section 3.2. On the other hand, the 15 second
>> duration yields a very low probability of a reboot or a permission change.
>I still think, especially while this is at experimental, you should scope this with
>an absolute max. But I'm just one reviewer. Work it out with your AD.
>
>>
>>
>>>> This YANG module defines the <cancel-schedule> RPC. This RPC may
>>>> be considered sensitive or vulnerable in some network environments.
>>>> Since the value of the <schedule-id> is known to all the clients that are
>>>> subscribed to notifications from the server, the <cancel-schedule> RPC
>>>> may be used maliciously to attack servers by canceling their pending
>>>> RPCs.
>>>> This attack is addressed in two layers: (i) security at the transport layer,
>>>> limiting the attack only to clients that have successfully initiated a secure
>>>> session with the server, and (ii) the authorization level required to cancel
>>>> an RPC should be the same as the level required to schedule it.
>>> To help me along, point me to the specifics of what you use to set and
>>> verify such an authorization level?
>> Indeed, there is a need for an authorization scheme, which is able to set and
>> verify the authorization level.
>> NETCONF (RFC 6241) does not explicitly define an authorization scheme, and
>> it is probably not within the scope of the current draft to define such a
>> scheme either.
>> Quoting RFC 6241:
>>
>>    This document does not specify an authorization scheme, as such a
>>    scheme will likely be tied to a meta-data model or a data model.
>>    Implementors SHOULD provide a comprehensive authorization scheme
>>    with NETCONF.
>>    ...
>>    Different environments may well allow different rights prior to and
>>    then after authentication. Thus, an authorization model is not
>>    specified in this document. When an operation is not properly
>>    authorized, a simple "access denied" is sufficient.
>I think you're saying that in production deployments today, the
>authorization policy is "the peer was able to send me a packet". Is that
>wrong?
>>
>>
>>
>> Please let us know if you have further comments or concerns about any of
>> the issues above.
>>
>> Thanks,
>> Tal.
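
For readers following the sched-max-future discussion, here is a minimal Python sketch of the kind of server-side admission check the thread is debating. It is an illustration only, not text or code from the draft. The 15-second default comes from the thread. The `Scheduler` class, the `max_pending` cap, and refusing past timestamps are all hypothetical choices made for this sketch. The cap is one possible way to bound the "massive number of commands" concern that sched-max-future alone does not address.

```python
from dataclasses import dataclass, field

SCHED_MAX_FUTURE = 15.0  # seconds; the default value quoted in the thread
MAX_PENDING = 100        # hypothetical cap on pending RPCs, not from the draft


@dataclass
class Scheduler:
    """Toy admission control for scheduled RPCs (illustration only)."""
    sched_max_future: float = SCHED_MAX_FUTURE
    max_pending: int = MAX_PENDING
    pending: dict = field(default_factory=dict)  # schedule_id -> scheduled time
    next_id: int = 1

    def schedule(self, now, scheduled_time):
        """Return a new schedule id, or raise ValueError if refused."""
        # Simplification for this sketch: past timestamps are refused.
        if scheduled_time < now:
            raise ValueError("scheduled time is in the past")
        # Near-future restriction: refuse anything beyond sched-max-future.
        if scheduled_time - now > self.sched_max_future:
            raise ValueError("scheduled time exceeds sched-max-future")
        # Hypothetical bound on how many RPCs may be queued at once.
        if len(self.pending) >= self.max_pending:
            raise ValueError("too many pending scheduled RPCs")
        schedule_id = self.next_id
        self.next_id += 1
        self.pending[schedule_id] = scheduled_time
        return schedule_id
```

With these assumptions, a request scheduled 5 seconds ahead is accepted, one scheduled 16 seconds ahead is refused, and further requests are refused once `max_pending` entries are queued, regardless of how near their scheduled time is.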