I am the assigned Gen-ART reviewer for this draft. For background on Gen-ART, please see the FAQ at <http://wiki.tools.ietf.org/area/gen/trac/wiki/GenArtfaq>. Please resolve these comments along with any other Last Call comments you may receive. Document: draft-ietf-mboned-mtrace-v2-08 Reviewer: David L. Black Review Date: 21 July 2011 IETF LC End Date: 19 July 2011 Summary: This draft is on the right track but has open issues, described in the review. This draft describes version 2 of multicast traceroute. Major Issues: This review is late because it took me much longer than expected to work through this draft in order to figure out how the protocol works. While I believe that the protocol does work if correctly implemented, I don't believe that the protocol is effectively implementable from this draft, and hence significant work is required. The open issues fall into two primary areas: - Message Structure: Sections 4 and 5 are confusing. - Message Processing: The most important functionality of this protocol is the router processing of messages in Section 9. I found numerous problems there, including significant missing pieces of the specification (e.g., the material on when and how to generate Reply messages is incomplete). In addition there are some missing security considerations around traffic amplification. -- Message Structure -- The text in sections 4 and 5 is confusing. The problem starts with the following sentence in section 4.2: An mtrace2 message MUST contain exactly one Mtrace2 Query header. Because the mtrace2 Query header is not discussed until section 5, this sentence is easy to misread as a requirement for exactly one Mtrace2 Query TLV. It gets worse, because it turns out that the Mtrace2 Query header is actually the value[V] for the Query, Request and Reply TLVs. Hence, calling that a Query header is an invitation to confusion (e.g., where are the Request and Reply headers?). I suggest renaming the Mtrace2 Query header to Mtrace2 Trace header and then changing the above sentence to: An mtrace2 message MUST begin with a Query, Request or Reply TLV. The value for each of these TLVs is an Mtrace2 Trace Header (see section 5). These three TLVs MUST NOT be used in any position other than the beginning of a message. That is not easy to figure out from the current text. 5. Mtrace2 Query Header What's the maximum size of the UDP packet? This is important because there's text in Section 9 on that depends on whether another TLV fits into the message or not. There needs to be a minimum size, as nothing will work if there isn't room for at least one standard response block after the header. How does one figure out whether the addresses in the Mtrace2 Query Header are IPv4 or IPv6? I suspect that at least one of the multicast address and source address needs to be valid (i.e., not all 1's for IPv4, not :: for IPv6). That should be stated, along with what happens when both addresses are invalid. State that the client address must be valid (not all 1's for IPv4, not :: for IPv6). Add the 24 bits of TLV to the packet structure diagrams in sections 5, 6, 7 and 8. Sections 6 and 7 - The IPv4 and IPv6 response blocks use the same TLV code (4), but have different contents and lengths. How does the receiver determine which response block is present for TLV code 4? Section 8 Mtrace2 Augmented Response Block The augmented response block is always appended to mtrace2 TLV header (0x04). No, definitely not. Section 4.2 says that the mtrace2 TLV code for an augmented response block is 5, not 4. The following would be better: The augmented response block uses TLV code 5 and always comes after a standard response block TLV (code 4). -- Message Processing -- Section 9 is confusing and incomplete. This is the most important functionality in the document, and it needs to be clear to the reader. I'm having a very hard time figuring out exactly what the router is supposed to do, and I found a number of things that are clearly wrong. It is not sufficient to just fix the things noted in this review - Section 9 needs to be clear and cogent about exactly what the router does for each received message so that code can be written from it. Section 9.1.1 Packet Verification If the router determines that it is not the proper last-hop router, or it cannot make that determination, it does one of two things depending if the Query was received via multicast or unicast. If the Query was received via multicast, then it MUST be silently dropped. If it was received via unicast, a forwarding code of WRONG_LAST_HOP is noted and processing continues as in Section 9.2. The last sentence has a number of problems: - It needs to say that the TLV type in the first TLV is changed from 1 to 2, - Section 9.2 requires that there be a response block, but the Query message doesn't have one. - Section 9.2 will forward the Request towards the multicast source, even though the Query was not received by a proper last-hop router. That seems wrong. It would be clearer to describe how to return the WRONG_LAST_HOP error to the client here instead of referencing Section 9.2. Duplicate Query messages as identified by the tuple (Mtrace2 Client Address, Query ID) SHOULD be ignored. This MAY be implemented using a simple 1-back cache (i.e. remembering the Mtrace2 Client Address and Query ID of the previous Query message that was processed, and ignoring future messages with the same Mtrace2 Client Address and Query ID). Duplicate Request messages MUST NOT be ignored in this manner. Is a 1-entry cache really sufficient? What if 2 clients are trying to query concurrently causing their duplicate messages to be interleaved? 9.1.2. Normal Processing When a router receives an mtrace2 Query and it determines that it is the proper last-hop router, it changes the TLV type to 0x2 and treats it like an mtrace2 Request and performs the steps listed in Section 9.2. That's problematic - Section 9.2 requires that there be a response block, but there won't be one. Unlike the error case above, this mismatch can probably be addressed in Section 9.2. 9.2.2. Normal Processing Step 1 in this section is where the Request message needs to have a response block. The problems with the second reference from section 9.1 to section 9.2 might be correctable by explaining why the "If there was no room" condition can't occur at the router that converts a Query to a Request. This case needs to be noted in the initial text in Section 9.2. Also in step 1, how does the router determine whether there is "room in the current buffer"? 9.2.3. Mtrace2 Request Received by Non-Supported Router Describe how the client matches the ICMP message with its original Query. 9.3.1. Destination Address If the Previous-hop router for the mtrace2 Request is known for this request and the number of response blocks is less than the number requested (i.e., the "# hops" field in the mtrace2 Query header), the packet is sent to that router. If the Incoming Interface is known but the Previous-hop router is not known, the packet is sent to an appropriate multicast address on the Incoming Interface. The appropriate multicast address may depend on the routing protocol in use, MUST be a link-scoped group (i.e. 224/24 for IPv4, FF02::/16 for IPv6), MUST NOT be ALL-SYSTEMS.MCAST.NET (224.0.0.1) for IPv4 and All Nodes Address (FF02::1) for IPv6, and MAY be ALL-ROUTERS.MCAST.NET (224.0.0.2) for IPv4 or All Routers Address (FF02::2) for IPv6 if the routing protocol in use does not define a more appropriate group. Otherwise, it is sent to the Mtrace2 Client Address in the header. There are multiple problems with that paragraph: 1) First line: "response blocks" is wrong. The treatment of the augmented response block's count of already returned response blocks is missing. 2) There are no instructions on what to do when the number of response blocks is equal to the number of hops - the Request has to be converted to a Reply. 3) The last line sends a Request to a Client. If that's what was intended, the Request first needs to be converted to a Reply. 9.3.2. Source Address When the NO_SPACE error occurs, the router sends back the mtrace2 Reply with contained data and the NO_SPACE error code as in Section 9.4, and continues the mtrace2 Query by sending an mtrace2 Request containing the same mtrace2 Query header and its standard and augmented response blocks. The corresponding augmented response block type is "# Mtrace2 Response Blocks Returned" described in Section 8. There are a couple of problems with that paragraph: 1) What does "its" mean? If it refers to the header, that would send all of the response blocks. I think only this router's response block is sent. 2) How does the hop count arithmetic work? I suspect that the value for number of response blocks returned is counted against hop count, but that's not mentioned in section 9.3.1 . 14. Security Considerations There are additional security considerations. I can see at least two ways in which this protocol appears to offer opportunities for traffic amplification: 1) Suppose the client's address in the Query is a multicast address; the Reply will get multicast by the router to a multicast address that the client can't reach. Use of a multicast address as a client address should probably be prohibited. 2) The proxying and NO_SPACE behaviors result in one Query returning multiple Reply messages. That's necessary for the protocol to work, but needs some control to prevent abuse. Some sort of rate-limiting recommendations should suffice to prevent this from becoming a problem. Minor Issues: Section 1 Introduction The last 3 paragraphs on p.6 are not clear about how hop count works. The statement in the last paragraph on p.6 that a lower hop count can be used is not consistent with the summary descriptions in the previous two paragraphs that don't discuss hop count. That needs to be clearer. One possible approach would be: - Add a sentence to the third to last paragraph (about generation of Request from Query) to say that the hop count in the Query is copied to the Request. - Add a new paragraph following that paragraph to say that when a router that is not a first-hop router receives a Request message, the router adds a standard response block If there are fewer standard response blocks that the hop count, the Request is forwarded. If the number of response blocks matches the hop count or an error condition such as "no route" is encountered, a Reply is generated. There is significant overlap between sections 1 and 3 - consider whether they should be merged. Of the two sections, the contents of Section 3 would be a better Introduction than the current section 1. Nits: Add a definition of First-hop router to Section 2. idnits 2.12.12 found a bunch of problems Checking nits according to http://www.ietf.org/id-info/checklist : ---------------------------------------------------------------------------- == There are 7 instances of lines with non-RFC2606-compliant FQDNs in the document. == There are 1 instance of lines with multicast IPv4 addresses in the document. If these are generic example addresses, they should be changed to use the 233.252.0.x range defined in RFC 5771 The above 2 errors can be ignored - they FQDNs are MCAST.NET addresses that this draft needs to use, and the multicast IPv4 address is the IPv4 address for one of those FQDNs. Miscellaneous warnings: ---------------------------------------------------------------------------- == The document seems to lack the recommended RFC 2119 boilerplate, even if it appears to use RFC 2119 keywords -- however, there's a paragraph with a matching beginning. Boilerplate error? (The document does seem to have the reference to RFC 2119 which the ID-Checklist requires). -- The document date (January 7, 2011) is 195 days in the past. Is this intentional? Checking references for intended status: Proposed Standard ---------------------------------------------------------------------------- (See RFCs 3967 and 4897 for information about using normative references to lower-maturity documents in RFCs) == Unused Reference: '2' is defined on line 1413, but no explicit reference was found in the text == Unused Reference: '5' is defined on line 1422, but no explicit reference was found in the text ** Obsolete normative reference: RFC 2373 (ref. '3') (Obsoleted by RFC 3513) ** Obsolete normative reference: RFC 2434 (ref. '4') (Obsoleted by RFC 5226) ** Downref: Normative reference to an Unknown state RFC: RFC 1071 (ref. '5') == Outdated reference: A later version (-11) exists of draft-ietf-mboned-auto-multicast-08 _______________________________________________ Ietf mailing list Ietf@xxxxxxxx https://www.ietf.org/mailman/listinfo/ietf