Re: Second Last Call: <draft-hammer-hostmeta-16.txt> (Web Host Metadata) to Proposed Standard -- feedback

On 23/06/2011, at 2:04 AM, Paul E. Jones wrote:

> Mark,
> 
>> Generally, it's hard for me to be enthusiastic about this proposal, for
>> a few reasons. That doesn't mean it shouldn't be published, but I do
>> question the need for it to be Standards Track as a general mechanism.
> 
> I believe standards track is appropriate, since the objective is to define
> procedures that are interoperable and the specification defines a set of
> procedures that would be implemented by multiple software products.

That can be said of pretty much every specification that comes along; does this imply that you think everything should be standards track?

At the end of the day, it's standards track if the IESG says it is. They asked for feedback on the Last Call, and I gave mine. It's not the end of the world if this becomes Standards Track, but I felt that it shouldn't pass without comment.


>> Mostly, it's because I haven't really seen much discussion of it as a
>> general component of the Web / Internet architecture; AFAICT all of the
>> interest in it and discussion of it has happened in more specialised /
>> vertical places. The issues below are my concerns; they're not
>> insurmountable, but I would have expected to see some discussion of them
>> to date on lists like this one and/or the TAG list for something that's
>> to be an Internet Standard.
> 
> You might be right that more discussion has happened off the apps-discuss
> list, but I would not equate that with not being a component of the web
> architecture.  

... and I didn't equate it with that either; I said it was concerning that it hadn't been discussed broadly.


> On the contrary, host-meta has a lot of utility and is an
> important building block for the web architecture.  With host-meta, it is
> possible to advertise information in a standard way, discover services, etc.
> Some of the latter is not fully defined, but cannot be defined without this
> standard in place.

A "lot of utility" and being "an important building block" are completely subjective, of course. I'd agree with a statement that it's an important building block of OAuth, for example, but it seems quite premature to call it an important building block of the Web arch. 


>> * XRD -- XRD is an OASIS spec that's used by OpenID and OAuth. Maybe I'm
>> just scarred by WS-*, but it seems very over-engineered for what it
>> does. I understand that the communities had reasons for using it to
>> leverage an existing user base for their specific use cases, but I
>> don't see any reason to generalise such a beast into a generic
>> mechanism.
> 
> XRD is not complicated.  It's an XML document spec with about seven elements
> defined.  In order to convey metadata, one must have some format defined and
> XRD is as good as any other.  I don't think the use of XRD should be
> considered a negative aspect.  OpenID uses (through Yadis) a precursor to
> XRD called XRDS.  I'm not sure about OAuth's usage of XRD.  Either way, does
> this matter?

Choosing your foundations well matters greatly.
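
For anyone following along, here's roughly what a minimal host-meta XRD
document looks like and what it takes to consume one. A sketch only: the
element names and the namespace URI are my reading of XRD 1.0, so check
the actual spec before relying on them.

import xml.etree.ElementTree as ET

XRD_NS = "{http://docs.oasis-open.org/ns/xri/xrd-1.0}"

# A minimal host-meta document (illustrative, not normative).
DOC = """<XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
  <Subject>http://example.com/</Subject>
  <Link rel="copyright" href="http://example.com/copyright"/>
  <Link rel="lrdd" template="http://example.com/describe?uri={uri}"/>
</XRD>"""

root = ET.fromstring(DOC)
for link in root.findall(XRD_NS + "Link"):
    # Each Link carries either a fixed href or a URI template.
    print(link.get("rel"), link.get("href") or link.get("template"))

Simple, yes -- but every consumer still carries an XML parser and a
namespace around for what is, at heart, a list of links.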


>> * Precedence -- In my experience one of the most difficult parts of a
>> metadata framework like this is specifying the combination of metadata
>> from multiple sources in a way that's usable, complete and clear.
>> Hostmeta only briefly mentions precedence rules in the introduction.
> 
> I assume you are referring to the processing rules in 1.1.1?  How would you
> propose strengthening that text?

It's not a matter of strengthening the text, it's a matter of agreeing upon and defining an algorithm. As it sits, the document doesn't do much more than wave its hands about precedence.
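
To be concrete: by "an algorithm" I mean something at least as explicit
as the sketch below, where I've had to invent both the precedence key
(the link's "rel" value) and the rule (resource-specific beats
host-wide) myself, because the draft doesn't pin them down.

def merge_links(host_wide, resource_specific):
    # Key each link by its "rel"; later updates win, so the
    # resource-specific entries override the host-wide ones.
    merged = {link["rel"]: link for link in host_wide}
    merged.update({link["rel"]: link for link in resource_specific})
    return list(merged.values())

host = [{"rel": "copyright", "href": "http://example.com/c"}]
page = [{"rel": "copyright", "href": "http://example.com/page1/c"}]
print(merge_links(host, page))  # the per-resource copyright link wins

Whether that's the right rule is exactly the question the document needs
to answer.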


>> * Scope of hosts -- The document doesn't crisply define what a "host"
>> is.
> 
> This might be deliberate and not really a fault of this document.  The
> "hostname" that we are all used to using for a "host" may or may not refer
> to a physical host.  It might refer to a virtual host or a virtually hosted
> domain.

You use "might" a lot here. Do you know what it is, or are you just speculating?


> In any case, this term is consistent with the term used on the HTTP
> spec and the header line "Host:".

The Host header field conveys a host and a port, whereas this document seems to attach a much vaguer concept to the term.


>> * Context of metadata -- I've become convinced that the most successful
>> uses of .well-known URIs are those that have commonality of use; i.e.,
>> it makes sense to define a .well-known URI when most of the data
>> returned is applicable to a particular use case or set of use cases.
>> This is why robots.txt works well, as do most other currently-deployed
>> examples of well-known URIs.
>> 
>> Defining a bucket for potentially random, unassociated metadata in a
>> single URI is, IMO, asking for trouble; if it is successful, it could
>> cause administrative issues on the server (as potentially many parties
>> will need control of a single file, for different uses -- tricky when
>> ordering is important for precedence), and if the file gets big, it will
>> cause performance issues for some use cases.
> 
> Not all of the use cases are defined, but the host-meta document provides
> some examples, such as finding the author of a web page, copyright
> information, etc.  There has been discussion of finding a user's identity
> provider.  Each of these examples fits well within the
> host-meta framework.  It builds upon the "web linking" (RFC 5988) work you
> did in a logical and consistent way, and I see these as complementary
> documents.  To your concern, host-meta is flexible, but the functionality is
> bounded.

That's almost a meaningless statement.


>> * Chattiness -- the basic model for resource-specific metadata in
>> hostmeta requires at least two requests; one to get the hostmeta
>> document, and one to get the resource-specific metadata after
>> interpolating the URI of interest into a template.
> 
> This is true, but the web is all about establishing links to other
> information.  I view this as a "good thing" about host-meta: it provides a
> very simple syntax with a way to use well-defined link relation types to
> discover other information.

Again, you really haven't addressed the concern.
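
To spell the concern out, here's the round trip I'm describing, as a
sketch. The "lrdd" relation and the "{uri}" template placeholder are my
reading of the draft; treat the details as illustrative.

import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

XRD_NS = "{http://docs.oasis-open.org/ns/xri/xrd-1.0}"

def resource_metadata(host, resource_uri):
    # Request 1: fetch the host-wide document.
    with urllib.request.urlopen(
            "http://%s/.well-known/host-meta" % host) as resp:
        root = ET.fromstring(resp.read())
    for link in root.findall(XRD_NS + "Link"):
        if link.get("rel") == "lrdd" and link.get("template"):
            # Interpolate the URI of interest into the template.
            url = link.get("template").replace(
                "{uri}", urllib.parse.quote(resource_uri, safe=""))
            # Request 2: fetch the resource-specific document.
            with urllib.request.urlopen(url) as resp:
                return resp.read()
    return None

That's two full HTTP round trips before the client has learned anything
about the resource it actually cares about -- per resource, in the worst
case.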


>> For some use cases, this might be appropriate; however, for many others
>> (most that I have encountered), it's far too chatty. Many use cases find
>> the latency of one extra request unacceptable, much less two. Many use
>> cases require fetching metadata for a number of distinct resources; in
>> this model, that adds a request per resource.
> 
> I think this comes back to using whatever makes the most sense for the job.
> Consider robots.txt for example.  We could define a "robots" link relation
> and query the robots.txt file in a two-step operation, or we can just fetch
> it from the well-known location.  A well-known location works fine for
> robots.txt, but it does not work well for everything.  Fetching copyright
> information or author information are examples.  The
> point is that host-meta provides a mechanism for discovering information
> about a URI.  This is very useful, IMO.

It's certainly useful for the cases that people are currently using it for, yes. It's less clear to me that it should be promoted as a general mechanism.


>> I'd expect a general solution in this space to allow describing a "map"
>> of a Web site and applying metadata to it in arbitrary ways, so that a
>> client could fetch the document once and determine the metadata for any
>> given resource by examining it.
> 
> But what of URIs that do not refer to a document?  One could learn a lot of
> metadata about a specific document following RFC 5988, but that does not
> work for URIs like "mailto:paulej@xxxxxxxxxxxxxx" or any other non-'http[s]'
> URI scheme.  Even for some URIs that use http[s] as the scheme, there may
> or may not be an associated document.  Host-meta could allow one to get
> metadata about any valid URI for a domain.

Sure. Again, I didn't say that there weren't uses for hostmeta, just that it's potentially inappropriate for lots of common ones.
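
For what it's worth, the "map" I mentioned could be as simple as the
sketch below. This is entirely hypothetical; no spec defines it. The
point is only that one fetch can answer metadata questions for any
number of resources.

# A site "map" pairing URI prefixes with metadata; the longest
# matching prefix wins, so more specific entries override general ones.
SITE_MAP = [
    ("http://example.com/blog/", {"author": "http://example.com/mnot"}),
    ("http://example.com/", {"copyright": "http://example.com/c"}),
]

def metadata_for(uri):
    for prefix, meta in sorted(SITE_MAP, key=lambda e: -len(e[0])):
        if uri.startswith(prefix):
            return meta
    return {}

print(metadata_for("http://example.com/blog/post1"))  # the blog's author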


>> If hostmeta is designed for specific use cases and meets them well,
>> that's great, but it shouldn't be sold as a general mechanism. So, I'm
>> -1 on this going forward as a standards-track general mechanism. I
>> wouldn't mind if it were Informational, or if it were Standards-Track
>> but with a defined use case.
> 
> Perhaps one could have made the same kind of argument about HTTP.  It is a
> general mechanism for fetching documents, files, or any data from a
> location.  This document is similar, but with a focus on providing metadata
> about URIs.  I think the procedures are useful and the document should go
> forward as a standards track RFC.


Wow, you're really stretching there, aren't you?


--
Mark Nottingham   http://www.mnot.net/



_______________________________________________
Ietf mailing list
Ietf@xxxxxxxx
https://www.ietf.org/mailman/listinfo/ietf

