--On Sunday, 05 July, 2015 08:00 +1200 John R Levine
<johnl@xxxxxxxxx> wrote:

>> Since DOIs are opaque, that doesn't preclude future use of a
>> numeric prefix as well for something completely different.
>
> Right. Now I see what the problem is -- opaque identifiers
> are jargon from databases (I did my PhD research on them) and
> many IETFers don't understand what they are.
>
> The point of an opaque identifier is that you can make no
> assumptions whatsoever about what its structure or format is.
> You can just find it somehow, and you can hand it back to
> services that use it to look up what it refers to.
>...
> I hope it's now clear why that would be a bad idea, and also
> why you shouldn't make any assumptions about what the DOIs of
> RFCs after RFC9999 will be.

And the problem with opaque identifiers that happen to be constructed in a consistent way, one from which humans can deduce a relationship to something else, is that they will be used that way.

To be clear about what follows, I'm far more concerned about the case of "given DOI, find document" than I am about "given document bibliographic reference, find and write down the DOI". To use his example, Andy knows how to get from 10.1364/JOCN.4.000001 to a particular journal, volume, and page number and he is almost certainly going to do that by using the algorithm in his head. Unless he is much more compulsive and has far more time on his hands than I believe is the case, he is unlikely to go to a DOI resolver system for each of the DOIs in that form, resolver systems that will just tell him what he knew already and that may point him through more indirection than he would find useful if he wants the document. You may know it is formally an opaque identifier, he may know it is formally an opaque identifier, but, in practice, he knows there is nothing opaque about it and that it would be silly to pretend.
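
As a rough illustration of that "algorithm in his head", here is a minimal Python sketch. The pattern is an assumption inferred only from the single example DOI above (journal code, dot, volume, dot, six-digit page), and the names ArticleRef and guess_article are hypothetical, not part of any DOI tooling. It is meant to show how easily a reader's shortcut gets built on a format nobody promised to keep, not to describe how anyone actually does it.

```python
import re
from typing import NamedTuple, Optional


class ArticleRef(NamedTuple):
    """What a reader believes the DOI suffix encodes (hypothetical type)."""
    journal: str
    volume: int
    page: int


# Assumed layout, inferred from one example only:
# "10.1364/" + journal code + "." + volume + "." + six-digit page number.
_PATTERN = re.compile(r"^10\.1364/([A-Z]+)\.(\d+)\.(\d{6})$")


def guess_article(doi: str) -> Optional[ArticleRef]:
    """Return the inferred reference, or None if the DOI does not fit
    the familiar layout and a resolver lookup would be needed instead."""
    match = _PATTERN.match(doi)
    if match is None:
        return None
    journal, volume, page = match.groups()
    return ArticleRef(journal, int(volume), int(page))


print(guess_article("10.1364/JOCN.4.000001"))
# ArticleRef(journal='JOCN', volume=4, page=1)
```

Any DOI that does not follow the assumed layout yields None, at which point the reader has no choice but to go to a resolver.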

Now, suppose the journal publisher decides that, from volume 8 onward, they are going to add 100 times the first three digits of Pi to the volume number field, eliminate the separating dot, and add an extra zero to the page number field in case they were to publish a lot that year. Assuming I got the arithmetic right, that would give the next volume's first article a DOI of

   10.1364/JOCN.3320000001

Andy (and hundreds of other readers who are behaving the same way) would, I assume, be at least mildly irritated because the new format is different and requires that they do something different and perhaps a bit harder. The curators of the system would, no doubt, say "opaque identifier, you had no right to the expectation that you could parse the DOI and translate it into a volume and page number, and still don't have that right even if you can figure out the new algorithm". Both would be correct in their positions, but knowing that is not especially helpful.

Now assume that, a few years in the future and perhaps to commemorate volume 10 and to educate their readers, the identifier were changed again to eliminate "JOCN." and the algorithm and, instead, use some form of a hash on the article contents (plus a database lookup to check for uniqueness and a way to adjust if needed). Again, it is an opaque identifier, so Andy and his colleagues have, in the eyes of the identifier-assigning folks and your comments above, no basis for complaint even though this new identifier format forces them to do a database lookup for each DOI. From their point of view that is unreasonable. When publisher 1364 also raises the subscription rates to cover the cost of the high performance and redundant DOI servers that were not needed before because most readers knew the algorithm and skipped the lookup, it would seem even more unreasonable. On the other hand, the DOI-assigning database administrators just say "opaque identifier, what are you complaining about". And, again, both are right.

I think there are clear human factors preferences about whether the use cases or the opaque identifier argument prevails, but YMMV.

Now, restated more precisely and in the light of your explanation, I objected (and object) to the choice of "rfc1149" because it provides an opportunity for confusion if the format of the identifier is changed in the future and because the use of ASCII characters, especially as part of what will be perceived as a field, rather than a dot-separated subfield, may be inconvenient to our growing international community for no good reason. That doesn't mean "can't be changed in the future" and I apologize for anything I may have said that was interpreted that way. But saying "opaque identifier" does not make changes less inconvenient and disruptive for the reasons described above.

Now my prediction, for reasons that parallel Melinda's comments, is that no active participant in the IETF will ever (other than experimentally or for demonstration purposes) locate and retrieve an RFC, especially an RFC referenced from another RFC, by looking up the DOI. That probably means that, with the exception of symbolic value and a tiny number of readers, including the DOI for an RFC in references within RFCs is a waste of time and bits, but see below.

Because we won't use them, an opaque identifier, even 10.17487/gazornplatz, should be just fine for direct IETF purposes, at least if the goal for assigning DOIs is symbolic, i.e., not that we expect anyone to use them to find RFCs but to impress some group of people with the fact that we have and assign DOIs because doing so adds prestige or credibility. But, the more we rely on the assumption that RFC-related DOIs won't be used to find RFCs, the more important it is that the suffixes be structured in a well-known, obvious, and stable way in practice, even if they are opaque identifiers in DOI theory.

best,
   john