Dale, thank you for your extensive review. I have entered a No Objection ballot for this document based on the follow-up discussion.

Lars

> On 2021-3-23, at 2:42, Dale Worley via Datatracker <noreply@xxxxxxxx> wrote:
>
> Reviewer: Dale Worley
> Review result: Ready with Issues
>
> I am the assigned Gen-ART reviewer for this draft. The General Area
> Review Team (Gen-ART) reviews all IETF documents being processed
> by the IESG for the IETF Chair. Please treat these comments just
> like any other last call comments.
>
> For more information, please see the FAQ at
>
> <https://trac.ietf.org/trac/gen/wiki/GenArtfaq>.
>
> Document: draft-ietf-dots-rfc8782-bis-05
> Reviewer: Dale R. Worley
> Review Date: 2021-03-22
> IETF LC End Date: 2021-03-22
> IESG Telechat date: unknown
>
> Summary:
>
> This draft is on the right track but has open issues, described in
> the review.
>
> I've provided a long list of minor editorial issues, and a short list
> of technical issues. I suspect that the technical issues have been
> resolved in the practices of the community and that their apparent
> status as problems stems from not getting the wording properly
> aligned with practice.
>
> Major issues:
>
> The condition of two DOTS mitigation requests overlapping depends on
> addresses (and alternatives to them) but as defined in section 4.4.1,
> does NOT depend on port numbers. However, other parts of the text
> seem to presume that port numbers are involved in testing for
> overlapping. The correct choice needs to be established and the text
> made consistent.
>
> Does the requesting of a mitigation only withdraw overlapping
> mitigations that were requested using the same signal channel, or is
> the effect global? If a mitigation request with trigger-mitigation =
> false is activated by ending of a signal channel, does reestablishing
> the channel withdraw it? (Naively I thought it would, but that isn't
> stated.)
> If so, how are the former and the current signal channels
> correlated, given that cuid collisions can prevent them from using the
> same common identifiers? Indeed, the text does not make it clear how
> a mitigation that is triggered by the ending of a signal channel can
> be withdrawn, other than by the expiration of its timer.
>
> Minor issues:
>
> The 4.09 response is used to report cuid conflicts, but also various
> other conflicts. Given that cuid conflicts require specific
> processing, and can happen when other conflicts could also be
> reported, it seems to me that for cuid conflicts, you want that the
> response MUST include conflict-information.
>
> In section 4.4.1 there is a discussion of a configuration where a
> client communicates through two different gateways to one same server
> using a different certificate to communicate with each gateway. The
> text discusses a configuration where we want the two transaction
> streams to be treated as one by the client and server. It seems to me
> that this is an unusual situation which can only succeed if both the
> client and server have specific configuration for it. As a
> consequence, the situation doesn't need to be discussed in this
> document. Conversely, the default result of this topology is that the
> client and server treat both transaction streams separately (and
> perhaps neither of them is aware of the overall topology). It seems
> like this case should work correctly without any special
> considerations, and so does not need to be documented specifically,
> either.
>
> The overall framework for signal channel configuration is not clear.
> By default, I assume that the client sets the channel configuration,
> constrained by the limits on parameter values imposed by the server,
> and that these values apply to communication in both directions (when
> applicable). The text in 4.5 and 4.5.1 is consistent with this model.
> However text in 4.5.2 talks about "agents" changing configuration
> values, which implies it's possible for the server to set channel
> configuration. There is discussion in section 4.5.3 of a server
> sending "a validity time with a configuration it sends", which makes
> no sense if only the client can change the configuration -- the
> configuration won't change until the client changes it. Also "the
> update of the configuration data if a change occurs at the DOTS server
> side". The model needs to be established, and the text aligned with
> it.
>
> Nits/editorial comments:
>
> Global editorial issues:
>
> There is a lot of special terminology, and it would help if
> definitions were gathered in section 2. Additionally, this would help
> reveal where the text uses undefined synonyms of defined terms,
> several cases of which I have spotted.
>
> There are issues involving "Observe". One is at the start of section
> 4.4, where the text refers to "subscribe", but that is not the term
> used in CoAP, indeed CoAP deliberately avoids that term. Also, unless
> one is familiar with CoAP, one thinks GET has no side-effects, and
> thus cannot possibly establish a subscription. There are related
> issues in sections 4.4.2.1 and 4.4.2.2 that left me wondering for
> which GET requests Observe was mandatory and/or permitted and what
> values (0 and/or 1) were permitted. I think it would help to start
> 4.4.2.1 with an overview discussion of the permitted/required uses of
> Observe in DOTS GET requests.
>
> It would help to have adjectives for a mitigation request with
> trigger-mitigation = false, and for a mitigation request with
> trigger-mitigation = true.
>
> It seems that "deactivating" a mitigation request is used as an
> undefined synonym of "withdrawing" it, but (on my first two reads), I
> thought it meant "delete".
> At this point, I suspect that the words
> hide complexity which has not been made explicit: the client
> "requests" a mitigation with trigger-mitigation = false, but the loss
> of the channel "activates" it. Worse, "activation" causes the actions
> that are described as being caused by "requesting" a mitigation with
> trigger-mitigation = true. A description of the states, the
> transitions between them, and the verbs to describe them should be
> given, perhaps in section 2.
>
> Section 4.4.1 is 16 pages long and really should be cut into a number
> of subsections.
>
> Section 4.4.1 contains two parallel but different
> definitions/discussions of conflict-information. Not being in a
> position to print the document, I can't quite make out what is going
> on, but I suspect some reorganization of the section is in order to
> replace the two partial definitions with one complete one. (This
> might be connected with the entries in section 9 and/or section 10.3.)
> The two parallel definitions are partially excerpted below, and both
> have the problem that the contextual text says that the response will
> include "enough information for a DOTS client to recognize ...", but
> the definition of conflict-information states that it is optional:
>
> -----
>
>    The response includes enough information for a DOTS client to
>    recognize the source of the conflict as described below in the
>    'conflict-information' subtree with only the relevant nodes listed:
>
>    conflict-information: Indicates that a mitigation request is
>    conflicting with another mitigation request. This optional
>    attribute has the following structure:
>
> -----
>
>    For both 2.01
>    (Created) and 4.09 (Conflict) responses, the response includes enough
>    information for a DOTS client to recognize the source of the conflict
>    as described below:
>
>    conflict-information: Indicates that a mitigation request is
>    conflicting with another mitigation request(s) from other DOTS
>    client(s).
>    This optional attribute has the following structure:
>
> -----
>
> Detailed editorial issues:
>
> (Note that some of these are summarized in a clearer way above.)
>
> 1. Introduction
>
> The example of Figure 1 is introduced by this paragraph:
>
>    An example of a network diagram that illustrates a deployment of DOTS
>    agents is shown in Figure 1. In this example, a DOTS server is
>    operating on the access network. A DOTS client is located on the LAN
>    (Local Area Network), while a DOTS gateway is embedded in the CPE
>    (Customer Premises Equipment).
>
> But the example also includes a DOTS gateway, and would have been
> clearer to me if the statement introducing DOTS gateways was made
> before the start of the example rather than after it:
>
>    The DOTS client can
>    communicate directly with a DOTS server or indirectly via a DOTS
>    gateway.
>
> 3. Design Overview
>
>    support for asynchronous Non-confirmable messaging
>
> It might be worth noting here or in section 2 that "Non-confirmable"
> (and "Confirmable") are CoAP technical terms.
>
>    Absent such mutual agreement, the DOTS
>    signal channel MUST run over port number 4646 as defined in
>    Section 10.1, for both UDP and TCP.
>
> It might be worth stating this port number is for both the client and
> the server to use (or that 4646 is just the listening port for
> servers).
>
>    Also, the DOTS server may rely on the signal
>    channel session loss to trigger mitigation for preconfigured
>    mitigation requests (if any).
>
> This doesn't carry quite the right idea. What is really going on is
> that the DOTS client may configure mitigation requests that will be
> automatically acted upon by the server if the signal channel session
> is lost. This is a required facility of the server, but it may be
> relied upon by the client.
>
>    DOTS signaling can happen with DTLS over UDP and TLS over TCP.
>
> s/can happen/can use/ or perhaps "can happen over".
>
>    In deployments where multiple DOTS clients are enabled in a network
>    (owned and operated by the same entity) ...
>
> I think you want something like "In deployments with multiple DOTS
> clients in a single network and administrative domain ...".
>
>    o Port Control Protocol (PCP) [RFC6887] or Session Traversal
>    Utilities for NAT (STUN) [RFC8489] may be used to retrieve the
>    external addresses/prefixes and/or port numbers.
>
> Would be clearer if it is "may be used by the client to retrieve ...",
> as the preceding paragraph is about the translator and here we are
> talking about the client without explicitly mentioning it.
>
> 4.4. DOTS Mitigation Methods
>
>    GET: DOTS clients may use the GET method to subscribe to DOTS
>    server status messages or to retrieve the list of its
>    mitigations maintained by a DOTS server (Section 4.4.2).
>
> Unless one is aware of the "Observe" option of CoAP, using GET to
> establish a subscription seems impossible, as it is a side-effect.
> The reader could be warned by wording like:
>
>    GET: DOTS clients may use the GET method to retrieve the list
>    of its mitigations maintained by a DOTS server (Section
>    4.4.2), or (using the CoAP Observe option [RFC7641]) to
>    subscribe to DOTS server status messages.
>
> --
>
>    Mitigation requests MUST NOT be delayed
>    because of checks on probing rate (Section 4.7 of [RFC7252]).
>
> How does this sentence connect with the preceding sentences of the
> paragraph? Also, what does "probing" refer to? I suspect you mean
> that mitigation requests can be Non-confirmable and would by default
> fall under the rules of the preceding sentences, but you don't want
> that. So the sentence could be clarified as "However, mitigation
> requests MUST NOT be delayed by these limitations."
>
> 4.4.1. Request Mitigation
>
>    with the trailing "=" removed from the encoding
>
> Should be 'the trailing two "="', 'the trailing "="s', or similar,
> since the base64 encoding of a string of 16 bytes will always end in
> two "=".
>
>    DOTS servers MUST return 4.09 (Conflict) error code to a DOTS
>    peer to notify that the 'cuid' is already in use by another
>    DOTS client.
>
> The error code 4.09 has other defined uses in the signal channel.
> Given the special and "global" action needed based on this error code,
> there must be an unambiguous way for the client to identify cuid
> collision. Unfortunately, there is no "session initiation handshake"
> message for which a 4.09 response would be unambiguous. It seems like
> the best choice is to look for conflict-information in the response,
> since it has a conflict-cause value "CUID Collision". But
> conflict-information is optional. I recommend making
> conflict-information mandatory in this situation. However, see my
> comments at the end of the section regarding the lack of clarity
> whether conflict-information is mandatory or optional.
>
>    If the 'mid' value has reached 3/4 of (2^(32) - 1) (i.e.,
>    3221225471) and no attack is detected, the DOTS client MUST
>    reset 'mid' to 0 to handle 'mid' rollover.
>
> It sounds like, but does not say explicitly, that mid rollover automatically
> invalidates any active high-mid mitigation request, and thus, if the
> client wants to maintain any existing request, it must recreate them
> (necessarily with small mid values). This needs to be clarified.
>
>    The default value of the parameter is 'true' (that is, the
>    mitigation starts immediately). If 'trigger-mitigation' is not
>    present in a request, this is equivalent to receiving a request
>    with 'trigger-mitigation' set to 'true'.
>
> The second sentence is completely redundant, but I suspect that a
> practical need for it has been discovered.
>
>    ... or the 'cuid' was generated from a rogue DOTS client.
>
> Probably s/from/by/.
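
A side note on the two numeric points in the 4.4.1 comments above (the trailing "=" padding and the 'mid' rollover value): the short Python sketch below is only an illustrative check, not text from the draft or the review. It assumes standard base64 from the Python standard library and uses 16 zero bytes as a stand-in for the 16-byte input; the names padding_count and MID_RESET_THRESHOLD are hypothetical.

```python
import base64


def padding_count(n_bytes: int) -> int:
    """Return the number of trailing '=' in the base64 encoding of n_bytes bytes."""
    # bytes(n) is n zero bytes; the padding depends only on the length, not the values.
    encoded = base64.b64encode(bytes(n_bytes)).decode("ascii")
    return len(encoded) - len(encoded.rstrip("="))


# Assumption: integer (floor) division, which reproduces the value quoted from the draft.
MID_RESET_THRESHOLD = (3 * (2**32 - 1)) // 4

print(padding_count(16))      # 16 bytes encode to 24 characters, 2 of them padding
print(MID_RESET_THRESHOLD)    # 3221225471
```

Since 16 mod 3 is 1, the encoding of any 16-byte string ends in exactly two "=" characters, which is the basis for the wording suggested in the review.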
>
> But it seems that there is a valid situation where duplicate cuids are
> plausible, when two DOTS clients are using the same certificate to
> peer with a server because that certificate is what the server
> administrator provided to peer with the server. I don't know if that
> is worth mentioning here, though.
>
>    If a DOTS client is provisioned, for example, with distinct
>    certificates as a function of the peer server-domain DOTS
>    gateway, distinct 'cdid' values may be supplied by a
>    server-domain DOTS gateway. The ultimate DOTS server MUST treat those
>    'cdid' values as equivalent.
>
> I'm having a hard time following this, probably because I am not
> familiar with the language used to describe these situations. I think
> it means
>
>    If a DOTS client is provisioned, for example, with distinct
>    certificates to use to peer with distinct server-domain DOTS
>    gateways that peer to the same DOTS server, distinct 'cdid'
>    values may be supplied by the gateways to the server. The
>    ultimate DOTS server MUST treat those 'cdid' values as
>    equivalent.
>
> The final normative statement is clear, but it isn't clear to me how
> the server can implement that, unless it is provisioned with the
> knowledge that the two certificates are used by the same client.
>
> More subtly, if the server must treat them as equivalent, dependencies
> between transactions in one transaction stream apply to the union of
> the transaction streams through the two servers. E.g. the rule that
> mid is nearly-monotonic and the consequences thereof. Handling this
> correctly requires that the client knows that transactions through the
> two gateways will be handled equivalently by one same server, and that
> seems to require that the client also be configured with particular
> knowledge.
>
> It seems to me that there are actually two cases (1) a "dumb" case
> where the client happens to access the same server through two
> gateways, but neither the client nor the server knows that.
> In that case, the signal channel protocol "just works" normally.
> (2) a "smart" case where both the client and server must know that
> access through the two gateways is considered equivalent (but the
> gateways do not need to know). In that case, as long as both the
> client and server agree on this equivalence, the signal channel
> protocol also "just works".
>
> It's not clear that it is necessary to document here the "smart" case,
> as the needed adjustments are logically determined by the intended use
> case. If it is not needed, the quoted paragraph is probably best
> omitted, because trying to implement it generally would tend to cause
> the "dumb" case to fail.
>
>    If the mitigation request
>    contains the 'alias-name' and other parameters identifying the target
>    resources (such as 'target-prefix', 'target-port-range',
>    'target-fqdn', or 'target-uri'), the DOTS server appends the parameter
>    values in 'alias-name' with the corresponding parameter values in
>    'target-prefix', 'target-port-range', 'target-fqdn', or 'target-uri'.
>
> This sentence is not connected with any other processing -- what use
> is the concatenated value put to? Also, the processing described will
> NOT be done if alias-name is not present, suggesting that in some way
> it is optional. Also, the phrase "the parameter values in
> 'alias-name'" is undefined, as alias-name is an opaque string value.
> I suspect that some aspect of the processing has not been described.
>
> Perhaps the meaning is that an alias is always configured as a set of
> values for the other parameters, and that if a request contains both
> an alias name and other parameters, the effective request is formed by
> merging the two sets of parameter values. Though if that is meant,
> some provision must be made for the situation where the alias gives a
> value for a parameter that is contradicted by an explicit parameter in
> the request.
>
>    If the DOTS server does not find the 'mid' parameter value conveyed
>    in the PUT request in its configuration data [it may interpret it
>    in a certain way]
>
> It's not clear what is going on here, as "mid=..." is a mandatory part
> of the Uri-Path, and any such request must be rejected.
>
>    A DOTS server could reject mitigation requests when it is
>    near capacity or needs to rate-limit a particular client, for
>    example.
>
> This should be a separate paragraph, as it applies more broadly than
> the conditions of the first sentence of the paragraph. Also, it
> probably merits s/could/MAY/.
>
>    Two mitigation requests from a DOTS
>    client have overlapping scopes if there is a common IP address, IP
>    prefix, FQDN, URI, or alias.
>
> Probably worth stating explicitly that a common port number is NOT a
> factor in determining overlapping scopes.
>
>    If the DOTS server receives a mitigation request that overlaps with
>    an active mitigation request, but both have distinct
>    'trigger-mitigation' types, the DOTS server SHOULD deactivate (absent
>    explicit policy/configuration otherwise) the mitigation request with
>    'trigger-mitigation' set to 'false'.
>
> I'm pretty sure I don't know what this means. What does "deactivate"
> mean? The first time I read it, I thought it meant "delete". The
> second time, I suspected it meant the opposite action of "activate",
> which is what happens to a trigger-mitigation = false mitigation when
> the signal channel is lost. The third time, I was wondering why
> the reestablishment of the signal channel didn't automatically
> cause the trigger-mitigation = false mitigation to be deactivated.
>
>    conflict-scope: Characterizes the exact conflict scope. It may
>    include a list of IP addresses, a list of prefixes, a list of
>    port numbers, a list of target protocols, a list of FQDNs, a
>    list of URIs, a list of aliases, or references to conflicting
>    ACLs (by an 'acl-name', typically [RFC8783]).
>
> Note this text includes a "list of port numbers", but port numbers are
> not a factor in conflicts.
>
> Also, is it really intended that this parameter is, effectively, only
> human-readable, since there is no particular way to specify what type
> of datum the value contains?
>
> 4.4.2. Retrieve Information Related to a Mitigation
>
>    +-----------+----------------------------------------------------+
>    |         4 | Attack has exceeded the mitigation provider        |
>    |           | capability.                                        |
>    +-----------+----------------------------------------------------+
>
> "mitigation provider" is used in a few places but it appears that the
> intended term is "mitigator".
>
>    +-----------+----------------------------------------------------+
>    |         6 | Attack mitigation is now terminated.               |
>    +-----------+----------------------------------------------------+
>
> It seems like code 6 includes codes 5 and 7. Is this ambiguity
> intended? I suspect the text that is actually wanted is "DOTS client
> has withdrawn the mitigation request and the attack mitigation is now
> terminated." There is a parallel issue in section 10.6.2.
>
> 4.4.2.1. DOTS Servers Sending Mitigation Status
>
>    DOTS
>    implementations MUST use the Observe Option for both 'mitigate' and
>    'config' (Section 4.2).
>
> It's not clear what "MUST use the Observe Option" means. Does it mean
> that clients MUST use it in GET requests for 'mitigate' and 'config'?
> If so, is the client allowed to use "Observe: 1", despite that this
> section only discusses the "Observe: 0" case? Or does it just mean
> that servers must implement it, and thus respond correctly if a client
> sends it?
>
> 4.4.2.2. DOTS Clients Polling for Mitigation Status
>
>    In such case, the DOTS client recalls the mitigation request by
>    issuing a DELETE request for this mitigation request (Section 4.4.4).
>
> The term "recall" is used in a few places but it seems like the
> correct term is "withdraw" (section 4.4.4).
>
> 4.4.3. Efficacy Update from DOTS Clients
>
> In what way is an "efficacy update" different from an "update"? Can
> "efficacy" be removed without loss, or is it a term of art for updates
> to mitigation requests sent during attacks?
>
> It appears that an update is an "efficacy update" if and only if
> "attack-status" is present. This should be stated at the beginning of
> the section, as otherwise it's a mystery what distinguishes "efficacy
> updates".
>
> 4.4.4. Withdraw a Mitigation
>
>    Once the request is validated, the DOTS server immediately
>    acknowledges a DOTS client's request to withdraw the DOTS signal
>    using 2.02 (Deleted) Response Code with no response payload.
>
> s/DOTS signal/DOTS mitigation request/
>
> 4.5. DOTS Signal Channel Session Configuration
>
>    d. Acceptable signal loss ratio: Maximum retransmissions,
>    retransmission timeout value, and other message transmission
>    parameters for Confirmable messages over the DOTS signal channel.
>
> What are the names of these parameters in the signal-config structure?
>
>    As such, the transmission-related
>    parameters ('missing-hb-allowed' and acceptable signal loss ratio)
>    are negotiated only for DOTS over unreliable transports.
>
> It seems this could be said more clearly by listing the permitted
> fields: "only the 'heartbeat-interval' parameter [or whatever] is
> negotiated for DOTS over reliable transports".
>
> 4.5.1. Discover Configuration Parameters
>
>    At least one of the attributes 'heartbeat-interval',
>    'missing-hb-allowed', 'probing-rate', 'max-retransmit', 'ack-timeout',
>    and 'ack-random-factor' MUST be present in the PUT request. Note that
>    'heartbeat-interval', 'missing-hb-allowed', 'probing-rate',
>    'max-retransmit', 'ack-timeout', and 'ack-random-factor', if present, do
>    not need to be provided for both 'mitigating-config', and
>    'idle-config' in a PUT request.
>
> Must both the mitigating and idle configuration sections be present in
> the PUT? Does the requirement "At least one..."
> apply to both sections together or each section alone? If e.g.
> missing-hb-allowed is present in one section but not the other, the
> wording gives a vague suggestion that the same value is implicitly
> provided for the other section. Is this true?
>
>    The PUT request with a higher numeric 'sid' value overrides the DOTS
>    signal channel session configuration data installed by a PUT request
>    with a lower numeric 'sid' value. To avoid maintaining a long list
>    of 'sid' requests from a DOTS client, the lower numeric 'sid' MUST be
>    automatically deleted and no longer available at the DOTS server.
>
> Does this mean that the PUT with the higher sid installs what values
> it provides on top of the current configuration, or does it mean that
> the previous PUT's effect is entirely removed, that is, parameters not
> given in the higher-sid PUT take their default values? Note that the
> latter is resistant to problems from lost PUT requests but the former
> is not.
>
>    o If the DOTS server finds the 'sid' parameter value conveyed in the
>    PUT request in its configuration data and if the DOTS server has
>    accepted the updated configuration parameters, 2.04 (Changed) MUST
>    be returned in the response.
>
> Given the earlier statement "'sid' values MUST increase monotonically
> (when a new PUT is generated by a DOTS client to convey the
> configuration parameters for the signal channel).", if a server
> receives a PUT with the same sid as a previous PUT then the client is
> misbehaving and the server should send an error response.
>
>    A DOTS client may issue a GET message with a 'sid' Uri-Path parameter
>    to retrieve the negotiated configuration.
>
> Does this sid value matter, or is only its presence important? Also,
> you probably want to expand this to "a GET message for 'config' with a
> 'sid' Uri-Path parameter ...".
>
> 4.5.3. Configuration Freshness and Notifications
>
> The underlying processing is not made clear.
> Roughly, it seems that
> the idea is the server has the right to change the configuration
> unilaterally at any time, but if the client does a GET of the
> configuration, the server is required to commit that it won't change
> the configuration given in the response within Max-Age Option seconds.
>
> Or is this talking about a mechanism where the server can, at its
> initiative, tell the client how the client should behave? Which is
> completely different from section 4.5.2 where the client tells the
> server how to behave.
>
> 4.5.4. Delete DOTS Signal Channel Session Configuration
>
>    Upon bootstrapping or reboot, a DOTS client MAY send a DELETE request
>    to set the configuration parameters to default values. Such a
>    request does not include any 'sid'.
>
> I would take it as assumed that when the (D)TLS connection is
> established, that is, when the DOTS signal channel session is
> initiated, it has the default configuration parameters. Thus the
> DELETE described here is guaranteed to have no effect. But perhaps
> the intention is that the signal channel is conceptualized as
> persisting longer than the (D)TLS connection, and (perhaps) associated
> with the cuid/cdid value. If so, that should be stated clearly.
>
> 4.6. Redirected Signaling
>
>    If a DOTS server wants to redirect a DOTS client to an alternative
>    DOTS server for a signal session, then the Response Code 5.03
>    (Service Unavailable) will be returned in the response to the DOTS
>    client.
>
> What is "the response"? It seems that this is only sensible if the
> session is just being established, but there doesn't seem to be a
> specific session-initiation message. If you really mean that the
> server can redirect the session in response to any request, it would
> be helpful to state that directly.
> Also, you need to specify whether
> the connection to the alternate server is a new session (with
> independent state) or whether it is expected to be a continuation of
> the existing session (carrying the same state).
>
> 4.7. Heartbeat Mechanism
>
>    For
>    example, if a DOTS client receives a 2.04 response for its heartbeat
>    messages but no server-initiated heartbeat messages, the DOTS client
>    sets 'peer-hb-status' to 'false'. The DOTS server then will ...
>
> There is a lot of detail left out here, as there are messages and
> events involved that are not mentioned explicitly. I think what is
> meant is "For example, if a DOTS client receives a 2.04 response for
> its heartbeat messages but no server-initiated heartbeat messages, the
> DOTS client sets 'peer-hb-status' to 'false' in its next heartbeat
> message. Upon receiving that message, the DOTS server then will ..."
>
> It might be useful to explicitly state that the bodies of responses to
> heartbeat requests are empty.
>
> 6. YANG/JSON Mapping Parameters to CBOR
>
> It might help the implementors to tell whether this is the same as
> section 6 of RFC 8782 or not.
>
> 10.1. DOTS Signal Channel UDP and TCP Port Number
>
>    IANA has assigned the port number 4646 (the ASCII decimal value for
>    ".." (DOTS)) ...
>
> Ow!
>
> [END]
>
> --
> last-call mailing list
> last-call@xxxxxxxx
> https://www.ietf.org/mailman/listinfo/last-call
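
The remark quoted from Section 10.1, that 4646 is the ASCII decimal value for "..", can be checked with a very small Python sketch. It is illustrative only; the helper name ascii_decimal is hypothetical and does not come from the draft.

```python
def ascii_decimal(text: str) -> int:
    """Concatenate the decimal ASCII code of each character and read the result as an integer."""
    return int("".join(str(ord(ch)) for ch in text))


# ord(".") is 46, so two dots give the digits "46" + "46".
print(ascii_decimal(".."))  # 4646
```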