(This is the first in what will hopefully be a series of review comments on the latest version of the dane-openpgp specification. I'm breaking this up into several different topics in hopes of keeping any resulting discussion focused on the particular set of issues I've brought up.)

This is the first specification I'm aware of that hashes the local-part of an address to produce a corresponding identifier. Not only have we never gone this far before, we've actually tried to stay away from operations like address comparisons that have similar, albeit more limited, semantics.

With regard to this operation, there has been extensive discussion of the longstanding requirement that only agents with administrative authority over the associated domain can "interpret" the local-part of an address. Unfortunately, AFAICT this discussion has completely missed two fundamental and vitally important points.

First, there's no way to define a mapping of local-parts to a new set of identifiers *without* effectively interpreting the local-part! If you define the mapping as the draft currently does, implicit in that definition is that local-parts are case-sensitive. And similarly, if you convert the local-part to lower (or upper) case, you're now assuming the local-part is case-insensitive. And in the case of EAI, without some sort of normalization you're assuming that different UTF-8 representations of the same string of characters correspond to different recipients. (Which, as Harald Alvestrand and I both pointed out on the IETF list, is technically untenable and needs to be addressed. My suggestion was and is to specify that the same case-folding and normalization algorithm used for IDNs also be employed here.)

But - and this is the second fundamental point that AFAICT has been missed - who is doing the interpreting? In one sense it's the consumer of the OPENPGPKEY records in the DNS, and the discussion so far has focused on how such consumers don't have the right to do that.
But who published those records? That would be the owner of the domain - you know, the folks who *are* entitled to interpret the local-part of addresses in whatever fashion they choose.

So when a domain owner publishes such records in the DNS, a reasonable way to look at it is that they are effectively saying, "Everyone is allowed to interpret the local-parts of our addresses as specified in this document in this one narrow context." I'm pretty confident there's nothing in any standard that forbids such a delegation of authority.

And once you realize this is what is going on, not only does it become clear that this draft is *not* violating the longstanding rules about local-part interpretation, it casts the decision not to normalize the local-parts to lower (or upper) case in an entirely different light. By choosing not to normalize, this specification is effectively restricting its own applicability to domains with case-sensitive local-parts. That is, IMO, a highly suboptimal choice - the overwhelming majority of domains treat the local-part in a case-insensitive fashion, and so should the mechanism specified in this draft.

Or, to put this another way, the inherent limitations of using the DNS to provide the mapping from address to PGP key restrict the domain of applicability of this specification to domains with particular local-part policies, and the way in which the local-part to DNS mapping is specified determines which policies the specification supports. And while it seems logical to support a policy that's known to be in wide use, the specification also needs to be very clear that domains that employ case-sensitive local-parts MUST NOT avail themselves of this mechanism.

What needs to happen here is that the specification be revised to make it clear that this is what is going on: that by publishing such records a domain is granting a limited right to interpret the local-parts of its addresses.
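
To make the point about implicit interpretation a bit more concrete, here is a rough Python sketch of how each choice of mapping bakes in a local-part policy. Everything in it is an assumption for illustration only: the function name label_for is made up, SHA-256 merely stands in for whatever hash the draft specifies, and NFC plus str.lower() are simple stand-ins rather than the IDN case-folding and normalization algorithm.

```python
import hashlib
import unicodedata


def label_for(local_part: str, *, fold_case: bool, normalize: bool) -> str:
    """Map a local-part to a hex label under one particular interpretation.

    Illustrative only: SHA-256, NFC and str.lower() are stand-ins, not the
    algorithms named in the draft or used for IDNs.
    """
    if normalize:
        # Collapse different Unicode representations of the same characters.
        local_part = unicodedata.normalize("NFC", local_part)
    if fold_case:
        # Treat the local-part as case-insensitive.
        local_part = local_part.lower()
    return hashlib.sha256(local_part.encode("utf-8")).hexdigest()


# Hashing the raw octets implies case-sensitive local-parts:
raw_differs = (label_for("Ned", fold_case=False, normalize=False)
               != label_for("ned", fold_case=False, normalize=False))

# Lowercasing first implies case-insensitive local-parts:
folded_same = (label_for("Ned", fold_case=True, normalize=False)
               == label_for("ned", fold_case=True, normalize=False))

# Without normalization, composed and decomposed forms of the same
# string ("Jos\u00e9" vs "Jose\u0301") land on different labels:
eai_differs = (label_for("Jos\u00e9", fold_case=False, normalize=False)
               != label_for("Jose\u0301", fold_case=False, normalize=False))
```

Whichever flags a specification picks, it has picked a policy; there is no setting that avoids interpreting the local-part.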
(One can of course argue that a specification that fails to offer a solution to case-sensitive domains, or to domains that employ various forms of subaddressing semantics, is unacceptable. But I am emphatically not making that argument. I have a number of grave reservations about this draft that I am going to try to explain in subsequent messages, but this isn't one of them.)

There's also - as noted by Sean Leonard - a technical glitch in the current specification: The local-part is not the correct input to the hash function. A canonicalization step is needed because all of these addresses are equivalent:

   (1) first.last@xxxxxxxxxxx
   (2) first . last @example.com
   (3) "first.last"@example.com
   (4) "\f\i\r\s\t.last"@example.com

(2) is equivalent to (1) because CFWS has no semantics, (3) is equivalent to (1) because the enclosing quotes are not properly part of the address, and (4) is equivalent to (1) because quoted-pairs are semantically equivalent to just the quoted character.

I believe this is the entire list, so the obvious canonicalization to use on the local-part portion of an address prior to lowercasing and hashing is:

   (a) If the local-part is unquoted, remove any whitespace around periods.
   (b) Remove any enclosing double quotes.
   (c) Remove any literal quoting.

I might be inclined to say that this rather technical matter can wait to be resolved in a future update, but (1) implementations once deployed are difficult to change, and according to the draft there are already incompatible implementations out there, and (2) normalization needs to be revisited anyhow, so why not fix this as well?

Finally, a couple of observations about terminology are in order. The current text covering the hashing of local-parts begins with:

   The user name (the "left-hand side" of the email address, called the "local-part" in the mail message format definition [RFC5322] and the local-part in the specification for internationalized email [RFC6530]) is encoded in UTF-8 (or its subset ASCII).
   If the local-part is written in another encoding it MUST be converted to UTF-8.

First, the left-hand side of an email address is not a "user name" and should not be referred to as such. (The entire address is in some cases a "user name" of sorts, and in some cases the local-part is identical to some kind of login credential. But neither of these is universally true, and more to the point, none of this is relevant to the matter at hand.)

Second, it probably makes sense to note that local-part is an ABNF production contained in a broader syntax, not just a name.

Third, the term "encoding" here is inaccurate; it should be charset.

That's all for now.

Ned
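
P.S. For concreteness, the canonicalization steps (a)-(c) above might be sketched in Python roughly as follows. This is a minimal illustration and not proposed specification text: the function name canonicalize_local_part is made up, the input is assumed to be a local-part that has already been split off from the domain, and comments in parentheses and mixed quoted/unquoted forms are not handled.

```python
import re


def canonicalize_local_part(local_part: str) -> str:
    """Apply steps (a)-(c) to a local-part before lowercasing and hashing.

    Hypothetical helper; assumes the local-part is either entirely
    unquoted or a single quoted-string.
    """
    s = local_part.strip()
    if len(s) >= 2 and s.startswith('"') and s.endswith('"'):
        # (b) Remove the enclosing double quotes.
        inner = s[1:-1]
        # (c) Remove literal quoting: a backslash followed by any
        # character becomes just that character.
        return re.sub(r"\\(.)", r"\1", inner, flags=re.DOTALL)
    # (a) Unquoted: remove any whitespace around periods.
    return re.sub(r"\s*\.\s*", ".", s)


# The four equivalent local-parts from the list above.
forms = [
    "first.last",
    "first . last ",
    '"first.last"',
    r'"\f\i\r\s\t.last"',
]
canonical = {canonicalize_local_part(f) for f in forms}
```

With this sketch all four forms collapse to the single string first.last, which is then what would be lowercased and hashed.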