Reviewer: Christopher Wood
Review result: Has Issues
Document: draft-ietf-6man-rfc4941bis-12

I have reviewed this document as part of the security directorate's ongoing effort to review all IETF documents being processed by the IESG. These comments were written primarily for the benefit of the security area directors. Document editors and WG chairs should treat these comments just like any other last call comments.

The summary of the review is: Ready with issues

High-level comments:

This is a great document! I apologize for taking so long to review it. My main comments are on the details of the address generation logic and heuristics, and how it ties into the larger threat model. In general, a clear description of the threat model these temporary addresses aim to address might be worth including, perhaps by expanding the Security Considerations.

Comments:

Section 1.

   The default address selection for IPv6 has been specified in [RFC6724]. The determination as to whether to use stable versus temporary addresses can in some cases only be made by an application. For example, some applications may always want to use temporary addresses, while others may want to use them only in some circumstances or not at all. An Application Programming Interface (API) such as that specified in [RFC5014] can enable individual applications to indicate a preference for the use of temporary addresses.

I wonder if this should mention TAPS, which has discussed APIs for this sort of selection in the past. See https://tools.ietf.org/html/draft-ietf-taps-interface-10#section-5.2.13.

Section 1.2.

   The correlation can be performed by: <snip list>

This probably isn't exhaustive, so perhaps: "Correlation can be performed by a variety of attackers, including, though not limited to:" (or something to that effect).

Section 2.1.

   One of the requirements for correlating seemingly unrelated activities is the use (and reuse) of an identifier that is recognizable over time within different contexts.
   IP addresses provide one obvious example, but there are more. For example,

What about MAC addresses? As I understand it, most systems are moving towards MAC address randomization, though it's still probably worth mentioning. Likewise, similar to cookies, one could also mention TLS (or transport) layer identifiers, such as TLS session tickets. This is touched on somewhat in the Security Considerations.

Section 2.2.

   To make it difficult to make educated guesses as to whether two different interface identifiers belong to the same host, the algorithm for generating alternate identifiers must include input that has an unpredictable component from the perspective of the outside entities that are collecting information.

It seems like this "must" should be normative, and the text should probably reference RFC 4086 (https://tools.ietf.org/html/rfc4086).

Section 3.1.

   3. New temporary addresses are generated over time to replace temporary addresses that expire.

I assume expiration here means that the address is deprecated, right? If so, that might be worth clarifying.

   4. <snip> The lifetime of temporary addresses must be statistically different for different addresses, such that it is hard to predict or infer when a new temporary address is generated, or correlate a newly-generated address with an existing one.

This "must" is not normative, right? I assume not, since the previous guideline in this item ("the lifetime of an address should be further reduced when privacy-meaningful events ... takes place") does not require all temporary addresses to cease working. It might be better to drop the "or correlate a newly-generated address with an existing one" bit. Moreover, what does "statistically different" mean, precisely? It might be more accurate to talk about this property from the perspective of the adversary.
For example, I think this is trying to say that given two different temporary addresses, an adversary must have negligible probability in determining whether or not they correspond to the same or different sources. (That would match better with the Randomized Interface Identifier algorithms given in Section 3.3.)

Section 3.2.

   This document also assumes that an API will exist that allows individual applications to indicate whether they prefer to use temporary or stable addresses and override the system defaults (see e.g. [RFC5014]).

If a reference to TAPS is made for these APIs, it is probably also worth including here.

Section 3.3.

   The algorithm specified in Section 3.3.1 benefits from a Pseudo-Random Number Generator (PRNG) available on the system.

What does "benefits" mean here? If we're specifying an algorithm to generate random values, shouldn't a PRNG be *required*?

Section 3.3.2.

This section assumes a "hash-based" algorithm, but is specified using a PRF. Later in the text, it reads:

   F() could be the result of applying a cryptographic hash over an encoded version of the function parameters.

But a cryptographic hash is not a PRF. If the hash function is meant to be keyed, even that probably isn't sufficient. (Some constructions, like H(k || m) for secret k and input m, are vulnerable to length-extension attacks.) I think it's probably safest to recommend a particular construction, such as HKDF with secret_key and output length equal to the number of bytes needed for the interface identifier.

Moreover, the requirements for secret_key are not really strict enough. There's text about F(), e.g.:

   F() MUST also be difficult to reverse, such that it resists attempts to obtain the secret_key

And it is said that secret_key "SHOULD be of at least 128 bits," but what if it's less? What if it only has a single byte of entropy?

Section 3.4.

Constants here are used before they are defined. Moving Section 3.8 to somewhere before Section 3.4 might help.
What happens if the constants are chosen such that rule (5) is not possible to achieve?

Section 3.6.

   The frequency at which temporary addresses change depends on how a device is being used (e.g., how frequently it initiates new communication) and the concerns of the end user. The most egregious privacy concerns appear to involve addresses used for long periods of time (weeks to months to years). The more frequently an address changes, the less feasible collecting or coordinating information keyed on interface identifiers becomes. Moreover, the cost of collecting information and attempting to correlate it based on interface identifiers will only be justified if enough addresses contain non-changing identifiers to make it worthwhile. Thus, having large numbers of clients change their address on a daily or weekly basis is likely to be sufficient to alleviate most privacy concerns.

I don't disagree with the text, but is there anything we can cite here? Why do we think it's "sufficient," for example?

   Finally, when an interface connects to a new (different) link, existing temporary addresses for the corresponding interface MUST be eliminated, and new temporary addresses MUST be generated immediately for use on the new link.

If the addresses are eliminated, how does one run DAD and ensure that the same (or similar) addresses are not used on the new link?

Section 3.7.

   Devices implementing this specification MUST provide a way for the end user to explicitly enable or disable the use of temporary addresses.

Why is this a MUST, rather than a SHOULD? Since this is effectively describing an API, I think this ought to be relaxed.

Section 6.

   An implementation might want to keep track of which addresses are being used by upper layers so as to be able to remove a deprecated temporary address from internal data structures once no upper layer protocols are using it (but not before).
It seems an application might also want to consider other information linkable to select addresses in the future. For example, TLS resumption may link clients across two different temporary addresses. (This goes back to my comment on Section 2.1 above.)
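
To make the Section 3.3.2 suggestion above a little more concrete, here is a minimal sketch of what an HKDF-based F() could look like. This is only an illustration, not text proposed for the draft: the function name temporary_iid, the choice of SHA-256, the length-prefixed parameter encoding, and the hard failure on keys shorter than 128 bits are all assumptions of mine. HKDF itself is implemented per RFC 5869 using only the Python standard library.

```python
import hashlib
import hmac
from typing import Sequence

HASH = hashlib.sha256           # assumption: SHA-256 as the underlying hash
HASH_LEN = HASH().digest_size   # 32 bytes for SHA-256
IID_LEN = 8                     # a 64-bit interface identifier


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869, Section 2.2)."""
    if not salt:
        salt = bytes(HASH_LEN)  # default salt is HashLen zero bytes
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869, Section 2.3)."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF")
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        # T(i) = HMAC(PRK, T(i-1) | info | i)
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def temporary_iid(secret_key: bytes, params: Sequence[bytes]) -> bytes:
    """Hypothetical F(): derive an interface identifier from the parameters.

    Each parameter is length-prefixed so that distinct parameter lists can
    never encode to the same byte string (an assumed encoding).
    """
    # A length check cannot measure entropy; it only rejects keys that are
    # obviously too short. The key must still come from a good random source.
    if len(secret_key) < 16:
        raise ValueError("secret_key must be at least 128 bits")
    info = b"".join(len(p).to_bytes(2, "big") + p for p in params)
    prk = hkdf_extract(b"", secret_key)
    return hkdf_expand(prk, info, IID_LEN)
```

Because the output is keyed through HMAC, the H(k || m) length-extension concern does not apply, and rejecting short keys turns the "SHOULD be of at least 128 bits" into something an implementation can enforce for length, though not for entropy.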