I've reviewed this document as part of the transport area directorate's ongoing effort to review key IETF documents. These comments were written primarily for the transport area directors, but are copied to the document's authors for their information and to allow them to address any issues raised. When done at the time of IETF Last Call, the authors should consider this review together with any other last-call comments they receive. Please always CC tsv-dir@xxxxxxxx if you reply to or forward this review.

This draft is on the right track but has open issues, described in the review.

Review - good news:

I have reviewed a few selected aspects in draft-ietf-core-coap-09 (http://www.ietf.org/mail-archive/web/core/current/msg03280.html). I confirm that those past concerns are sufficiently addressed by this document.

Review - not so good news:

* General: In a nutshell, this document proposes a rather lightweight protocol that provides a subset of TCP/HTTP transport. In order to reduce message sizes and implementation complexity, the protocol sacrifices many TCP/HTTP features. But I really had a hard time figuring out what the protocol as defined in this document does *not* provide, in particular in the message layer.
At first sight, the CoAP protocol as defined in this document lacks features such as:

- Support for messages exceeding the path MTU
- Byte stream transport with segmentation and reassembly
- Flow control
- Congestion control for non-confirmable messages (this must IMHO be fixed)

Further typical TCP features are pretty much left to the implementation or to extensions, which will make the protocol more complex (IMHO, as complex as TCP):

- In-order delivery of unconfirmed messages (for confirmed messages, delivery seems to be in-order right now if the implementation indeed complies with the mandated limit of one outstanding transaction per destination, but any application requiring a data transfer of more than 1KB will need something better)
- Strong protection against message duplication (in particular if some checks are disabled based on cross-layer assumptions, which is allowed by this spec)
- Non-trivial transport features for multicast
- Security and DoS protection (mostly out of scope of this review)

These aspects are further detailed below with specific text references.

=> I think that the document needs a disclaimer in Section 1 that explicitly explains what users cannot expect from the CoAP base protocol (say, compared to a lightweight HTTP/TCP implementation with HTTP compression and the absolute minimum set of TCP features).

* General: The message layer, which basically provides a transport protocol service, is in parts only vaguely specified, and many transport-related protocol features can be overwritten by implementation- or environment-specific settings, or by future extension drafts. This makes it very difficult to review the protocol regarding completeness and robustness, such as atypical packet arrival patterns, reordering, and other corner cases that fundamentally matter for a transport protocol design.
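For reference, the transmission-related time values cited in this review (e.g., a MAX_TRANSMIT_SPAN of 45 s) follow from the draft's default transmission parameters (Section 4.8.2); a quick Python sketch of the derivation, using the draft's defaults:

```python
# Derived CoAP time values, computed from the draft's default
# transmission parameters (Section 4.8/4.8.2). All values in seconds.
ACK_TIMEOUT = 2.0        # initial retransmission timeout
ACK_RANDOM_FACTOR = 1.5  # randomization factor for the timeout
MAX_RETRANSMIT = 4       # maximum number of retransmissions of a CON

# Maximum time from the first transmission of a Confirmable message
# to its last retransmission (sum of the doubling timeouts).
MAX_TRANSMIT_SPAN = ACK_TIMEOUT * (2 ** MAX_RETRANSMIT - 1) * ACK_RANDOM_FACTOR

# Maximum time from the first transmission of a Confirmable message
# until the sender gives up on receiving an ACK.
MAX_TRANSMIT_WAIT = ACK_TIMEOUT * (2 ** (MAX_RETRANSMIT + 1) - 1) * ACK_RANDOM_FACTOR

print(MAX_TRANSMIT_SPAN)  # 45.0
print(MAX_TRANSMIT_WAIT)  # 93.0
```

This is why, for messages larger than 45 bytes, a PROBING_RATE of 1 byte/second becomes the tighter of the two limits discussed below.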
I believe that the protocol is simple enough that a full description of the state engine and event processing would be possible (like RFC 793 Section 3.9, Event Processing). But without a rigorous specification, it is difficult to figure out what a CoAP implementation would do in many corner cases, and whether interoperable implementations would interpret the spec in the same way.

=> The following list of open issues is almost certainly incomplete; other TSV experts might identify further problems.

* Section 4.2

   A CoAP endpoint that sent a Confirmable message MAY give up in
   attempting to obtain an ACK even before the MAX_RETRANSMIT counter
   value is reached: E.g., the application has canceled the request as
   it no longer needs a response, or there is some other indication
   that the CON message did arrive.  In particular, a CoAP request
   message may have elicited a separate response, in which case it is
   clear to the requester that only the ACK was lost and a
   retransmission of the request would serve no purpose.  However, a
   responder MUST NOT in turn rely on this cross-layer behavior from a
   requester, i.e. it SHOULD retain the state to create the ACK for the
   request, if needed, even if a Confirmable response was already
   acknowledged by the requester.

=> I think that this situation can also occur during an attack with spoofed addresses, i.e., it is not "clear" that the ACK was lost. In that case, retransmitting the request may even be the better alternative, in order to identify the attack. As already mentioned, state diagrams and a clear event handling would help to identify such corner cases (there may be more than this specific one). This would also simplify the discussion of when it is indeed safe to release state information.

* Section 4.3

   At the CoAP level, there is no way for the sender to detect if a
   Non-confirmable message was received or not.
   A sender MAY choose to transmit multiple copies of a Non-confirmable
   message within MAX_TRANSMIT_SPAN, or the network may duplicate the
   message in transit.

=> This section lacks any guidance on how frequently non-confirmable messages may be sent. Section 4.7 mandates a maximum PROBING_RATE for congestion control. With the default parameters, MAX_TRANSMIT_SPAN is 45 s and PROBING_RATE is 1 byte/second, i.e., for messages larger than 45 bytes, the limit for multiple copies is given by MESSAGE_SIZE/PROBING_RATE, not by MAX_TRANSMIT_SPAN.

* Section 4.5

   o  A constrained server MAY even want to relax this requirement for
      certain non-idempotent requests if the application semantics make
      this trade-off favorable.  For example, if the result of a POST
      request is just the creation of some short-lived state at the
      server, it may be less expensive to incur this effort multiple
      times for a request than keeping track of whether a previous
      transmission of the same request already was processed.

=> I think that this section must state more strongly that both endpoints must agree on those modified semantics. Otherwise, it is not clear to me whether the client and server implementations would indeed be interoperable, in particular if they are implemented independently and thus make different assumptions. The client here asked for reliable transfer, but the server actually ignores that request for reliable transfer, right?

* Section 4.6

   Message sizes are also of considerable importance to implementations
   on constrained nodes.  Many implementations will need to allocate a
   buffer for incoming messages.  If an implementation is too
   constrained to allow for allocating the above-mentioned upper bound,
   it could apply the following implementation strategy:
   Implementations receiving a datagram into a buffer that is too small
   are usually able to determine if the trailing portion of a datagram
   was discarded and to retrieve the initial portion.
   So, if not all of the payload, at least the CoAP header and options
   are likely to fit within the buffer.  A server can thus fully
   interpret a request and return a 4.13 (Request Entity Too Large)
   response code if the payload was truncated.  A client sending an
   idempotent request and receiving a response larger than would fit in
   the buffer can repeat the request with a suitable value for the
   Block Option [I-D.ietf-core-block].

=> This document must include a discussion on flow control, i.e., what happens if the receiver's receive buffer is full or if an application stalls and does not consume data for a longer time (exceeding the retransmission timeout).

Explanation: For constrained devices with small receive buffers and communication with more than one endpoint, it seems to me pretty likely that at some points in time no receive buffer is available. The protocol spec does not discuss what happens if the buffer is too small to process even the header, and what the behavior of the receiver should be (silently dropping the incoming message? sending a RST? does the behavior depend on whether it is CON or NON?). I think that this spec must provide guidance on how the protocol deals with buffer shortage.

TCP's solution to this kind of situation is flow control by the receive window. In CoAP, there seems to be an implicit assumption that messages can either always be "somehow" processed by a receiver or safely be dropped. As long as the protocol allows only one outstanding transaction per destination, and allocates a dedicated receive buffer for a full CoAP packet for each destination, out of my head this indeed seems to work without deadlocks, because we basically have the alternating-bit protocol. But in more complex situations with small buffer sizes (e.g., multiple transactions/applications sharing one buffer, or in-sequence delivery for more than one transaction), I think that the protocol could run into deadlocks, because it cannot prevent a sender from sending or retransmitting data into a receiver not having any receive buffer. I am not an expert on formal protocol verification, i.e., I cannot provide an exact specification for the minimum set of implementation requirements that safely prevents deadlock (also see my other remarks on state engine specification). But I am really concerned that the document does not even mention the terms "flow control", "buffer sizing", etc.

* Section 4.7

   In order not to cause congestion, Clients (including proxies) MUST
   strictly limit the number of simultaneous outstanding interactions
   that they maintain to a given server (including proxies) to NSTART.
   An outstanding interaction is either a CON for which an ACK has not
   yet been received but is still expected (message layer) or a request
   for which neither a response nor an Acknowledgment message has yet
   been received but is still expected (which may both occur at the
   same time, counting as one outstanding interaction).  The default
   value of NSTART for this specification is 1.

=> This section MUST clarify congestion control for non-confirmable messages. I miss a clear recommendation on how frequently a sender is allowed to send non-confirmable messages if there is no other feedback. I think that a maximum data rate of PROBING_RATE would be reasonable and safe, but I recall some discussion on other proposals (e.g., mandating a confirmable message every X non-confirmable messages, etc.).

* Section 4.8.2

   o  PROCESSING_DELAY is the time a node takes to turn around a
      Confirmable message into an acknowledgement.  We assume the node
      will attempt to send an ACK before having the sender time out, so
      as a conservative assumption we set it equal to ACK_TIMEOUT.
=> I assume that the spec wants to say "a receiver MUST have sent an ACK after PROCESSING_DELAY"? I have not found that requirement elsewhere in the document. If it is not a MUST requirement, the calculations involving PROCESSING_DELAY do not seem to reflect the worst case and are therefore not really useful for a worst-case analysis.

* Section 5.3.1

   A token is intended for use as a client-local identifier for
   differentiating between concurrent requests (see Section 5.3); it
   could have been called a "request ID".

=> In my understanding, concurrent requests are not allowed by this spec, i.e., why does this document not recommend using an empty token as long as NSTART=1? It apparently just wastes scarce bandwidth if there is only one allowed request to a destination. As an editorial note, this reference to Section 5.3 is strange here; this is the only paragraph in the document where concurrent requests are mentioned.

* Section 5.3.2

   The exact rules for matching a response to a request are as follows:

   1.  The source endpoint of the response MUST be the same as the
       destination endpoint of the original request.

   2.  In a piggy-backed response, both the Message ID of the
       Confirmable request and the Acknowledgement, and the token of
       the response and original request MUST match.  In a separate
       response, just the token of the response and original request
       MUST match.

   In case a message carrying a response is unexpected (the client is
   not waiting for a response from the identified endpoint, at the
   endpoint addressed, and/or with the given token), the response is
   rejected (Section 4.2, Section 4.3).

=> To me, the CoAP message processing seems underspecified. What really happens if either the Message ID or the token mismatches (two entirely different cases), i.e., what will the endpoint put into the RST message?
Section 4.2 states "The Acknowledgement message MUST echo the Message ID of the Confirmable message, and MUST carry a response or be empty (see Section 5.2.1 and Section 5.2.2)."; based on this text I cannot figure out what the response would be. For interoperability between implementations, this sort of event matters.

=> Would it be allowed to send back a response both by a CON and a NON message, with the same token but different Message IDs? If so, how would the matching deal with this?

* Section 5.3.2

   Implementation Note:  A client that receives a response in a CON
      message may want to clean up the message state right after
      sending the ACK.  If that ACK is lost and the server retransmits
      the CON, the client may no longer have any state to correlate
      this response to, making the retransmission an unexpected
      message; the client may send a Reset message so it does not
      receive any more retransmissions.  This behavior is normal and
      not an indication of an error.  (Clients that are not
      aggressively optimized in their state memory usage will still
      have message state that will identify the second CON as a
      retransmission.  Clients that actually expect more messages from
      the server [I-D.ietf-core-observe] will have to keep state in any
      case.)

=> I am confused by this sort of argument for removing state. This statement probably refers to Token state, since some kind of Message ID state has to be kept at least for MAX_LATENCY according to Section 4.8.2? Again, I'd expect the protocol specification to clearly state what the minimum requirements on keeping state are.

* Section 5.4 and Section 5.10

=> The maximum size of these options, in particular if more than one is used at the same time, can easily exceed the IPv6 MTU of 1280 bytes. In other words, a single non-fragmented IP packet will not only lack space for payload if options are used; possibly a single packet will not even be sufficient to transport all required options. What does the CoAP base protocol do in that case?
Discard that request/response and return an application error? Why does Section 5 not have any guidance on size/segmentation issues if options are (too) large?

* Section 8.1

   A multicast request is characterized by being transported in a CoAP
   message that is addressed to an IP multicast address instead of a
   CoAP endpoint.  Such multicast requests MUST be Non-confirmable.

=> A normative statement on congestion control for *sending* to multicast addresses is missing. I think that a slow-speed network can get congested very easily by multicast messages, i.e., this matters for the main CoAP use cases. I believe that sending 1 byte/second is safe for multicast destinations.

* Section 8.1

   When a server is aware that a request arrived via multicast, it MUST
   NOT return a RST in reply to NON.  If it is not aware, it MAY return
   a RST in reply to NON as usual.  Because such a Reset message will
   look identical to an RST for a unicast message from the sender, the
   sender MUST avoid using a Message ID that is also still active from
   this endpoint with any unicast endpoint that might receive the
   multicast message.

=> Why is a RST forbidden by a MUST? I would understand the motivation for a SHOULD, but if a server is overloaded by multicast requests and runs out of processing resources for multicast requests, isn't there a need to tell the sender that it has to stop using this multicast group?

* Section 8.2

   When matching a response to a multicast request, only the token MUST
   match; the source endpoint of the response does not need to (and
   will not) be the same as the destination endpoint of the original
   request.

=> So, the token is the only way to deal with packets that are duplicated in the network? Then, this section must IMHO expand further on how to select token IDs for multicast transfer. For use in multicast, Section 5.3.1's "The client SHOULD generate tokens in such a way that tokens currently in use for a given source/destination endpoint pair are unique."
is not sufficient; the token must in addition be unique during MAX_LATENCY, right?

* Section 8.2

   If a server does decide to respond to a multicast request, it should
   not respond immediately.

=> The spec leaves open whether a server is allowed to respond with a Confirmable message. If a large number of servers respond, the ACK traffic for many CONs could be an issue, right? But if only NON is allowed, what happens if the server wants its message to indeed be delivered reliably to the requester?

* Section 8.2

   E.g., for a multicast request with link-local scope on a 2.4 GHz
   IEEE 802.15.4 (6LoWPAN) network, G could be (relatively
   conservatively) set to 100, S to 100 bytes, and the target rate to a
   conservative 8 kbit/s = 1 kB/s.  The resulting lower bound for the
   Leisure is 10 seconds.

=> While I like the idea of randomizing the response time to avoid in-cast problems, according to Section 4.8, a conservative assumption about the allowed data rate in a potentially congested network is PROBING_RATE = 1 byte/second. 1 kB/s might be realistic in a specific application scenario if the network does not have any other traffic, but the attribute "conservative" should not be used here, because reality with cross-traffic could be entirely different.

* Section 8.2

   If a CoAP endpoint does not have suitable data to compute a value
   for Leisure, it MAY resort to DEFAULT_LEISURE.

=> With this vague specification of the leisure time, the client has no means to know whether *any* response will ever arrive. The servers could, for instance, err on the size of the group and just all pick a large random leisure time. I think it would make sense to define an upper limit on the leisure time, to allow some interpretation on the client side. If this upper limit significantly exceeds what PROBING_RATE would suggest, servers may just randomly decide not to reply, instead of waiting for a long time.
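To make the rate-assumption point above concrete, the draft's lower bound for the Leisure period (lb_Leisure = S * G / R, Section 8.2) can be evaluated under different rate assumptions; a small Python sketch (the function name is mine, not from the draft):

```python
# Lower bound for the Leisure period per Section 8.2 of the draft:
#   lb_Leisure = S * G / R
# where G is the estimated group size, S the estimated response size
# in bytes, and R the target data transfer rate in bytes/second.
def leisure_lower_bound(group_size, response_size_bytes, rate_bytes_per_s):
    return group_size * response_size_bytes / rate_bytes_per_s

# The draft's 6LoWPAN example: G = 100, S = 100 bytes, R = 1 kB/s.
print(leisure_lower_bound(100, 100, 1000))  # 10.0 seconds

# With the truly conservative PROBING_RATE of 1 byte/second from
# Section 4.8, the same estimates imply an impractically long period:
print(leisure_lower_bound(100, 100, 1))     # 10000.0 seconds
```

The three-orders-of-magnitude gap between the two results illustrates why labeling 1 kB/s "conservative" is questionable in a network with cross-traffic.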
* Section 8.2.2

   When a forward-proxy receives a request with a Proxy-Uri or URI
   constructed from Proxy-Scheme that indicates a multicast address,
   the proxy obtains a set of responses as described above and sends
   all responses (both cached-still-fresh and new) back to the original
   client.

=> I don't understand from the document how this works. For instance, will these responses all have the same token? How can a client process this if it expects only one response from the proxy? My general impression is that the multicast mode of CoAP would require a more rigorous specification for being included in a PS document.

* Section 9

   DTLS is not applicable to group keying (multicast communication);
   however, it may be a component in a future group key management
   protocol.

=> I am not really familiar with DTLS. But communication to multicast addresses by CoAP cannot be secured by DTLS, right? If so, why is there not a big warning sign "DTLS is not available for multicast CoAP"?

* Section 11

=> This section IMHO lacks the description of two further attacks:

(a) The equivalent of a SYN flooding attack on TCP would be sending complex queries with CON to a server. Given that the cost of a CON request is small, this attack can easily be executed. Also, if the server responds with CONs, it will have to allocate buffer and retransmission logic for each request, and it will likely run out of resources. A simple remedy is rate limiting as mentioned in Section 4.7; this counter-measure should be repeated here.

(b) A subtle attack with spoofed addresses could possibly exploit the lack of congestion control in CoAP. Due to NSTART=1, a tricky attacker could prevent a server from communicating with a legitimate client, because only one transaction is allowed to one destination address. The attacker could try to always occupy this "slot".

Both attacks are due to the lack of a three-way handshake like in TCP.
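The rate-limiting remedy mentioned for attack (a) could, for instance, take the form of a per-endpoint token bucket. The following Python sketch is purely illustrative; the class name, parameters, and the 1152-byte burst (the draft's message size bound) are my choices, not anything specified by the draft:

```python
import time

# Hypothetical per-endpoint rate limiter, as one possible realization
# of the remedy suggested in Section 4.7 of the draft: a token bucket
# per client endpoint, refilled at a configurable byte rate.
class PeerRateLimiter:
    def __init__(self, rate_bytes_per_s=1.0, burst_bytes=1152):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.buckets = {}  # endpoint -> (remaining tokens, last refill time)

    def allow(self, endpoint, message_size, now=None):
        """Return True if a message of message_size bytes from this
        endpoint may be processed, False if it should be dropped."""
        now = time.monotonic() if now is None else now
        tokens, last = self.buckets.get(endpoint, (self.burst, now))
        # Refill the bucket proportionally to the elapsed time.
        tokens = min(self.burst, tokens + (now - last) * self.rate)
        if message_size <= tokens:
            self.buckets[endpoint] = (tokens - message_size, now)
            return True   # process the request
        self.buckets[endpoint] = (tokens, now)
        return False      # drop rather than allocate response state

limiter = PeerRateLimiter()
print(limiter.allow(("2001:db8::1", 5683), 100, now=0.0))  # True
```

Dropping (rather than answering) over-limit requests is deliberate in this sketch: allocating buffer and retransmission state for each CON is exactly the resource exhaustion described in (a).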
* Section 11

=> This section IMHO needs a discussion of minimum requirements on how to select Message IDs and Tokens. Both are a means to protect against "hijacking" of transactions / falsification of responses, but if an attacker can guess these values, the attacker can inject wrong data into a CoAP communication. Compare e.g. to a TCP receiver that carefully checks whether sequence numbers are valid, i.e., within the receive window.

Editorial nits:

* Section 2.2

   CoAP makes use of GET, PUT, POST and DELETE methods in a similar
   manner to HTTP, with the semantics specified in Section 5.8.  (Note
   that the detailed semantics of CoAP methods are "almost, but not
   entirely unlike" those of HTTP methods:

=> s/unlike/like/ ?

* Section 3

   Following the header, token, and options, if any, comes the optional
   payload.  If present and of non-zero length, it is prefixed by a
   fixed, one-byte Payload Marker (0xFF) which indicates the end of
   options and the start of the payload.  The payload data extends from
   after the marker to the end of the UDP datagram, i.e., the Payload
   Length is calculated from the datagram size.  The absence of the
   Payload Marker denotes a zero-length payload.  The presence of a
   marker followed by a zero-length payload MUST be processed as a
   message format error.

=> I think that the term "payload marker" is kind of dangerous; it would be better to use a term like "end-of-option option". When I first read this section, I wondered whether a CoAP implementation could just scan through the packet to find the beginning of the payload by the first occurrence of 0xFF after the default CoAP header. However, this would require 0xFF to be masked in all options. Masking is realized in Section 3.1, but apparently not in Section 3.2 and Section 5.4.

* Section 4.4

   The same Message ID MUST NOT be re-used (in communicating with the
   same endpoint) within the EXCHANGE_LIFETIME (Section 4.8.2).
   Implementation Note:  Several implementation strategies can be
      employed for generating Message IDs.  In the simplest case a CoAP
      endpoint generates Message IDs by keeping a single Message ID
      variable, which is changed each time a new Confirmable or Non-
      confirmable message is sent regardless of the destination address
      or port.  Endpoints dealing with large numbers of transactions
      could keep multiple Message ID variables, for example per prefix
      or destination address.  The initial variable value should be
      randomized.

=> Using a single Message ID variable is IMHO only possible if there is only a single message outstanding to any address, because the Message ID has to be kept for verifying responses. This implies that even in the "simplest case" there is also one Message ID variable per address. I wonder whether the Implementation Note should be something of the sort "implementations will typically store Message IDs per destination, but they may use a single counter to ensure uniqueness among several destinations".

* Section 4.6

   header and options are likely to fit within the buffer.  A server
   can thus fully interpret a request and return a 4.13 (Request Entity
   Too Large) response code if the payload was truncated.  A

=> The syntax "4.13" is not introduced at this stage; it could make sense to add a brief sentence early in the document to explain the response code format.

* Section 4.8

   Message transmission is controlled by the following parameters:

=> At least DEFAULT_LEISURE is not defined in the text until this table (and it is not really self-explanatory).

* Section 4.8.2

=> The whole section on time values derived from transmission parameters is pretty hard to parse. Instead of organizing it according to parameters, it would be better to highlight the subset of parameters that actually matter for an implementation, and to state exactly which events mark the beginning and end of each duration.

* Section 5.3.1

   The Token is used to match a response with a request.
   The token value is a sequence of 0 to 8 bytes.

=> While CoAP optimizes its protocol fields down to single bits, the document does not comment at all on reasonable sizes for the token. At least some text mentioning the high overhead of a 4- or 8-byte token compared to the rest of the CoAP header could be useful. Possibly also addressing the security/size tradeoff.

* Section 5.10

=> I don't understand why the Proxy-Uri option is longer than the others, and why the maximum length is 1034.

Finally, please note that I am not subscribed to the core WG mailing list.

Thanks
Michael