Reviewer: Martin Thomson
Review result: Not Ready

As a bit of a preface here, I've been aware of this work for some time, but have refrained from offering an opinion. As liaison to the W3C, I felt like my responsibility was to facilitate the conversation. However, as Barry has asked me to do a review, I feel obligated to set that attempt at neutrality aside.

The biggest issue here is that this document is very much unwanted by its target audience, as is its antecedent, RFC 6874. I share that concern.

I have clear, strong indications from folks at three major browsers (conveniently, I am at W3C TPAC this week and so was able to talk with a few people directly on this topic; inconveniently, I've not had a lot of time for this review) that this change to the URI specification is not just something that they don't want to implement, but that it is not good for the Web. Public communications from them will be somewhat more polite and circumspect, but it has been made clear to me that this change is not wanted.

The IETF does occasionally publish specifications that don't end up being implemented, but we usually look for signals that a protocol might be implemented before even starting work. Here, we have a strong signal that a specification won't be implemented. Mark Nottingham asks the same question as point 1 in [1]. (I don't personally find his second and third points to be especially problematic given adequate consultation, but even there, there are a few concerns that I will outline below.)

I do want to give due credit to the authors - Brian in particular - for being very open and forthright in their consultation with the affected constituency. They have been proactive and responsive in a nearly exemplary fashion.

Overall, I think that it would be better for the IETF to declare RFC 6874 as Historic(al).
There might be some residual value in RFC 4007 from a diagnostic perspective, but the use of zone identifiers in URIs seems fundamentally incompatible with the goals of URIs.

I do recognize that HTTP (and so the Web) is not the only protocol affected by this sort of change. The goal is to change all URI types. However, I believe that HTTP is pretty important here and I have a fair sense that the sort of concerns I raise with respect to HTTP apply (or should apply) to other schemes.

---

There are a few technical concerns I have based on reviewing the draft. Some of these - on their own - are significant enough to justify not publishing this document.

Inclusion of purely local information in the *universal* identity of a resource runs directly counter to the point of having a URI. This creates some very difficult questions that the draft does not address.

For instance (1), the Web security model depends on having a clear definition for the origin of resources. The definition of Origin depends on the representation of the hostname and it relies heavily both on uniqueness (something a zone ID potentially contributes toward) and consistency across contexts (which a zone ID works directly against). Now, arguably resources that are accessed by link-local URIs don't need and cannot guarantee either property, but this is an example of the sort of problem that needs to be dealt with when local information is added to a component that is critical to web security.

For instance (2), in HTTP and several other protocols, servers depend on the host component - as it appears in the URI - to determine authority. If there is no rule for stripping the zone ID from URIs, server hostname checks will depend on the client. That exposes link-local servers to information that they need to filter out. Some might not be prepared to do that.
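
To make the two examples above concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not anything taken from the draft or from a browser: `origin` and `strip_zone_id` are hypothetical helpers, the interface names `eth0` and `wlan0` are made up, and the literals use the `%25` delimiter from RFC 6874 purely for illustration.

```python
def origin(scheme: str, host: str, port: int) -> tuple:
    """Hypothetical origin tuple: (scheme, host as written in the URI, port)."""
    return (scheme, host, port)


def strip_zone_id(host: str) -> str:
    """Hypothetical helper: drop the zone ID from a bracketed IPv6 literal.

    Anything that is not a bracketed literal is returned unchanged.
    """
    if not (host.startswith("[") and host.endswith("]")):
        return host
    # Everything after the first "%" is treated as the zone ID.
    address, _sep, _zone = host[1:-1].partition("%")
    return f"[{address}]"


# The same link-local address, named by two clients (or by one client on two
# interfaces) whose local zone names differ.
host_a = "[fe80::1%25eth0]"
host_b = "[fe80::1%25wlan0]"

# With the zone ID left in place, the origin and the host the server would
# compare against both depend on which client built the URI.
origins_differ = origin("http", host_a, 80) != origin("http", host_b, 80)

# With the zone ID stripped, both clients present the same host.
same_after_strip = strip_zone_id(host_a) == strip_zone_id(host_b)
```

In this sketch `origins_differ` and `same_after_strip` are both true, which is the inconsistency described above: without a stripping rule, what the server sees is a function of client-local configuration.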
Hostname checks are critical for security, especially the consistent treatment of the field across different components like serving infrastructure, web application firewalls, access control modules, and other components.

This is a non-backwards-compatible change to RFC 3986. The only issue related to this that is addressed in the draft is the question of document management - this updates RFC 3986 - but surely there are other concerns that might need to be addressed. I see some effort to address software backwards-compatibility in discussion threads, but I found very little in the draft itself.

The configuration of zones on a machine could be private information, but this information is being broadcast to servers. In HTTP, that is in Host header fields; on the Web, in document.location. This information might contribute significant amounts of information toward a fingerprint. I appreciate that the stripping of zone ID was never implemented, but it is a useful feature. Arguments in Section 5 depend on the zone IDs being hard to guess, but that isn't true. Zone IDs are - in practice - low entropy fields. More critically, they are fields that are sent to servers.

Zone ID size is not bounded - most implementations will have a size limit on the authority or host portion of a URI (256 octets is sufficient for current names), but the implication is that Zone IDs could be arbitrary length.

Though percent-decoding is not likely to be a concern from a specification perspective (the operative specification from the browser perspective does not apply pct-decoding to a v6 address [2]), what work has been done to verify that a zone ID won't break existing software?

[1] https://mailarchive.ietf.org/arch/msg/last-call/4vEKZosvKvqJ9cufSm5ivsCho_A/
[2] https://url.spec.whatwg.org/#concept-host-parser