Re: Last Call: An IETF URN Sub-namespace for Registered Protocol Parameters to BCP

Graham Klyne <GK@NineByNine.org> · Thu, 04 Jul 2002 00:34:49 +0100

Keith,

we could bat these arguments back and forth without making any further 
progress.  I think I have real use-cases for this proposal for which the 
concerns you raise just don't seem to be a problem.  On the other hand, I 
can recognize a legitimate concern with:

[[
>But neither do I want to give official blessing to folks to re-cast
>traditional IETF protocols into new syntactic forms.  And a lot of
>the interest I've seen in having URI equivalents for IETF protocol
>parameter names was from people who wanted to do just that - often
>with the explicit intent of producing variant implementations in order
>to disrupt the installed base.
]]

I think a more constructive way forward would be to work on an appropriate 
form of words indicating the concerns and purposes for which these IETF 
URNs SHOULD NOT be used.

I perceive that your concerns relate mostly to attempts to transplant 
entire protocol structures from one framework to another, and in those 
cases I agree that it is difficult to transfer all the semantics 
faithfully.  But this proposal is aimed more at carrying information about 
protocol elements, the kind of detail that is described in a registry 
entry, into a different application environment.

Examples:

IETF CONNEG and W3C CC/PP have quite different overall structure and 
semantics, but they can still usefully employ common media feature 
definitions.  This is clearly not an attempt to recast CONNEG into XML, but 
it seems highly desirable to not end up with two almost parallel sets of 
media features when just one set would do fine.

My other example, which may seem closer to your concern but which really is 
not, is to do with storing and processing message metadata.  Here, I want 
to be able to use URIs to identify the various header fields that appear in 
a message.  Again, the goal is not to re-cast an IETF protocol in a 
different form, but to take information from an IETF protocol for use in a 
different application.
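
As a rough illustration of the metadata case (a sketch only: the URN prefix 
below is a made-up placeholder, not a registered name, and the function names 
are hypothetical), header fields parsed from a message can be re-keyed by URI 
without touching the message format itself:

```python
from email import message_from_string
from urllib.parse import quote

# Placeholder prefix chosen purely for illustration; NOT a registered URN.
HEADER_URN_PREFIX = "urn:ietf:params:example-header:"


def header_field_urn(field_name):
    """Build an identifier for a header field name (assumed URN form)."""
    # Header field names are case-insensitive, so normalise to lower case.
    return HEADER_URN_PREFIX + quote(field_name.strip().lower())


def metadata_pairs(raw_message):
    """Return (URN, value) pairs for each header field in a raw message."""
    msg = message_from_string(raw_message)
    return [(header_field_urn(name), value) for name, value in msg.items()]
```

The message itself is still parsed as an ordinary RFC 822 style message; only 
the stored metadata uses the URI form.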

#g
--

PS:  as far as I can tell, according to 
http://www.ietf.org/internet-drafts/draft-sun-handle-system-def-05.txt (for 
which I saw the announcement after sending my previous message), the handle 
system does operate federation of naming authorities:

         <Handle>          = <NamingAuthority> "/" <LocalName>

         <NamingAuthority> = *(<NamingAuthority>  ".") <NAsegment>

         <NAsegment>       = 1*(%x00-2D  /  %x30-3F / %x41-FF )
                           ; any octets that map to UTF-8 encoded
                           ; Unicode 2.0 characters except
                           ; octets '0x2E' and '0x2F' (which
                           ; correspond to the ASCII characters '.',
                           ; and '/').

         <LocalName>       = *(%x00-FF)
                           ; any octets that map to UTF-8 encoded
                           ; Unicode 2.0 characters

and:

     Naming authorities are defined in a hierarchical fashion resembling
     a tree structure. Each node and leaf of the tree are given a label
     that corresponds to a naming authority segment (<NAsegment>). The
     parent node represents the parent naming authority. Naming
     authorities are constructed left to right, concatenating the labels
     from the root of the tree to the node that represents the naming
     authority. Each label (or its <NAsegment>) is separated by the
     character '.' (octet 0x2E). For example, the naming authority for
     the Digital Object Identifier (DOI) project is  "10". It is a root-
     level naming authority as it has no parent naming authority for
     itself. It can, however, have many child naming authorities, e.g.,
     "10.1045" which is used as a naming authority for D-Lib Magazine.

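To make that structure concrete, here is a rough Python sketch (mine, not 
from the draft) that splits a handle at the first '/' and lists the chain of 
naming authorities implied by the '.' separators.  The function names are 
hypothetical, and the sketch works on decoded strings rather than raw octets:

```python
def parse_handle(handle):
    """Split a handle into naming-authority segments and a local name."""
    # An <NAsegment> may not contain '/', so the first '/' ends the authority;
    # the local name may itself contain further '/' characters.
    authority, sep, local_name = handle.partition("/")
    if not sep:
        raise ValueError("handle must contain '/'")
    segments = authority.split(".")
    # Each <NAsegment> is 1*(...), so empty segments are not allowed.
    if any(seg == "" for seg in segments):
        raise ValueError("empty naming authority segment")
    return segments, local_name


def authority_chain(segments):
    """List naming authorities from the root down to the full authority."""
    return [".".join(segments[: i + 1]) for i in range(len(segments))]
```

For a handle under "10.1045", authority_chain gives "10" followed by 
"10.1045", which is the kind of syntactic partitioning I had in mind.
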
At 05:02 PM 7/3/02 -0400, Keith Moore wrote:
> > > > Spurred by XML and related technologies (which I assert are far more than
> > > > mere "fashion") we are seeing URIs used for a wide range of purposes which
> > > > are not constrained by a requirement for dereferencing.   The use of URIs
> > > > for identifying arbitrary things is now a fact of life, and in some
> > > > technical domains is proving to be extremely useful.  You claim "harm",
> > > > but I recognize no such harm.
> > >
> > >Clarification: I claim "harm" for the proposed use of *URNs* because
> > >URNs were designed to be long-term stable names for (at least potentially)
> > >network-accessible resources, whereas the proposal is to use them as a
> > >way of generating globally unique strings like UUIDs or OIDs.
> >
> > I still don't see the "harm" here.
>
>basically, it's trivializing them.  they're overkill for this purpose, and
>using URNs for this purpose makes them seem less useful than they really
>are.
>
>that, and I think there'll be a very strong demand for them to be human-readable
>(i.e. to have visible structure) and syntactically derivable from the canonical
>name for the protocol elements (for those that have such names).
>
> > >I'm all for reuse of data models where it makes sense, but if the goal
> > >is really to "lock the various syntactic forms to a common semantic
> > >definition" (presumably one which is compatible with XML) then I take
> > >strong issue with that, as the XML model is quite dysfunctional for
> > >many purposes.  (as are the others, it's just that XML is the current
> > >bandwagon)
> >
> > I'm puzzled -- you appear to be arguing my point.  Yes, different syntactic
> > frameworks will (in isolation) tend to yield differing semantics.  Yes,
> > different syntactic frameworks are better suited for different
> > purposes.  But it seems to me that referring different uses to the same
> > original definition would help to inhibit that -- and if factors like
> > ordering or grouping are significant, then the definition will (hopefully)
> > capture that and place constraints on the syntactic contexts for re-use.
>
>I just don't happen to share your faith in this as a mechanism to inhibit
>or discourage semantic drift.  In every example I can think of where one
>data model is exported into a different context there has been semantic
>drift, even when the same names and official definitions were retained.
>(maybe there's less drift this way, maybe not - but it certainly doesn't
>inhibit drift)
>
> > >Using URIs for the names of the data elements won't stop that kind of drift.
> >
> > But not trying to re-use existing definitions seems to be a recipe for
> > Balkanization.
>
>I don't know how to avoid Balkanization.  Sometimes it seems better to
>let data models fork rather than to try to reconcile various differences -
>I'd cite RFC822, usenet, HTTP, and SIP as a good example of things that
>we shouldn't pretend have the same protocol elements even though
>we recognize that they share a common ancestry.
>
> > Maybe it won't work for all applications, but I think there are a
> > substantial number of cases where re-use of existing definitions is a
> > reasonable and desirable goal.
>
>I don't claim that re-use of a data model is not potentially useful.
>If nothing else, an existing data model can serve as a useful starting
>point for a new data model when the requirements or syntactic structures
>dictate not using the old one.
>
>But neither do I want to give official blessing to folks to re-cast
>traditional IETF protocols into new syntactic forms.  And a lot of
>the interest I've seen in having URI equivalents for IETF protocol
>parameter names was from people who wanted to do just that - often
>with the explicit intent of producing variant implementations in order
>to disrupt the installed base.
>
> > >But neither do we have to endorse it just so they will use our stuff.
> > >Especially when their using our stuff dilutes the utility of our stuff
> > >by not requiring widespread agreement on the media features used.
> >
> > Come again?  That seems to me to be entirely non-sequitur.  How can other
> > people using our stuff dilute its utility?  It is precisely in the nature
> > of this proposal that using these URIs would be assenting to the IETF
> > definition of their meaning.
>
>no it's not, because of the semantic drift that will occur.
>
>Someone once tried to demonstrate to me that it was perfectly reasonable
>to express iCalendar events in XML - but her demonstration used XML's
>date representation which didn't have a proper concept of timezones.
>Interpretation of dates in iCalendar were dependent on a separate timezone
>element, whereas the XML tool wanted to treat those dates as standalone.
>so the "obvious" conversion of iCalendar to XML - even though the elements
>mapped one-to-one - caused semantic drift and a loss of important
>functionality.
>
> > > > This URN namespace proposal will provide a way to incorporate
> > > > the IETF feature registry directly into the W3C work, in a way which is
> > > > traceable through IETF specifications.   Without this, I predict that the
> > > > parties who are looking to use the W3C work (notably, mobile phone
> > > > companies) will simply go away and invent their own set of media features,
> > > > without any kind of clear relationship to the IETF features.
> > >
> > >The w3c approach is encouraging them to do this anyway, by having
> > >all media features be URIs that anyone can create/assign without any
> > >agreement from anyone else.
> >
> > So we should roll over and play dead, and pretend that interoperability
> > doesn't matter?
>
>It's not clear that doing things the w3c way helps interoperability.
>
> > Actually, that's a misrepresentation of the W3C position, which is that
> > vocabularies gain currency through use -- the more people who use them, the
> > more useful, and more widely used they become.
>
>That's true to a point, but it also seems to be the case that controlled
>vocabularies that need to have consistent meaning across large groups
>need very careful definition and, well, "control".  Natural languages,
>by contrast, tend to drift continuously.  Sometimes that's useful, but
>perhaps not as useful for computer protocols as for humans that can
>intuitively accommodate a certain amount of semantic skew.
>
> > >The likely consequence of what is being proposed is for the URIs that we
> > >define to mean nearly, but not quite, the same thing as an IETF protocol
> > >parameter - but we have to try to pretend that they mean the same thing.
> > >And it will degrade interoperability.
> >
> > Er, no:  we *define* them to mean the *same* thing.  If implementations
> > play fast and loose with the defined meaning, that's nothing new.
>
>At the same time, by explicitly exporting them we are encouraging
>semantic drift.
>
> > >The very temptation to treat URNs as if they were as malleable as other
> > >URIs is part of what makes this proposal dangerous.  Since I think that
> > >URNs *will* be widely misused if they are used for protocol elements,
> > >I'd far rather have IANA assign ordinary URIs for this - then we will
> > >still get semantic drift but at least it won't dilute the value of URNs.
> >
> > In what sense are URNs not ordinary URIs?  They have particular
> > requirements for persistence that are not shared by all URI schemes.
>
>In order to make a URN persistent you really need to make them opaque
>(or mostly so) to humans.   It's really too bad that we even allowed
>URN namespace IDs to be human-meaningful, but that's water under the bridge.
>
> > > > (i) have a framework for assigning identifier values, in such a way that it
> > > > is possible by some means for a human to locate its defining
> > > > specification.  I can't see how to do this without exploiting a visible
> > > > syntactic structure in the name.
> > >
> > >ISBNs do not have a visible syntactic structure, at least, not an
> > >obvious one.  But they're quite frequently used to look up book information.
> >
> > I understand that ISBNs aren't persistent -- they get reused.
>
>They're not supposed to be, but it does happen in some countries -
>particularly those with less ISBN space allocated to them.
>So we have a NAT-like problem for ISBNs ...
>
> > Anyway, ISBN's *do* have an internal syntactic structure.
>
>I didn't say they didn't have one, I just said it was not obvious.
>
> >
> > > > (ii) have a framework for actually using the identifier in an
> > > > application:  in this case, I agree that the identifier should generally be
> > > > treated as opaque.
> > > >
> > > > Also, I think (d) contradicts your goal (a):  I cannot conceive any
> > > > scalable resolution mechanism that does not in some sense depend on
> > > > syntactic decomposition of the name.
> > >
> > >You should really read up on the CNRI handle system then.  There are a lot
> > >of things I don't like about it but it really was designed to have exactly
> > >this property.
> >
> > Based on a December 2001 article
> > (http://www.dlib.org/dlib/december01/blanchi/12blanchi.html), it seems to
> > me that Handles too depend on some syntactic structure to partition the
> > search space -- based on dynamic content types and metadata schema.
>
>Handles have evolved a bit since first envisioned - as I understand it the
>problem wasn't the inability of the non-partitioned search service to scale
>up to the number of queries but rather the difficulties associated with
>everybody trusting a centrally maintained flat search service.
>
>Someone from cnri might be able to fill in more detail.
>
> > Ah yes, and according to the internet draft on handles:
> >    http://www.ietf.org/internet-drafts/draft-sun-handle-system-09.txt
> > there *is* a clear syntactic structure:
>
>Yes, but the searching isn't (didn't used to be) federated according to that
>structure.  The scalability of the searching didn't depend on it -
>federating actually slowed things down unless you happened to consult the
>right server first.  (locality does affect search speed)
>
> > But I think the general idea still holds here -- if you
> > want to reliably and quickly dereference an identifier with Internet scope,
> > it cannot be completely opaque.)
>
>Hashing is faster than tree searching, especially if the tree is distributed.
>you federate the lookup because of trust issues (which are a kind of scaling
>issue, but not in terms of bandwidth or cpu cycles) and ease-of-cost-recovery
>issues, not to make the lookup more efficient or cheaper.
>
>Keith

-------------------
Graham Klyne
<GK@NineByNine.org>