RE: Second Last Call: <draft-hammer-hostmeta-16.txt> (Web Host Metadata) to Proposed Standard -- feedback

Mark,

> Generally, it's hard for me to be enthusiastic about this proposal, for
> a few reasons. That doesn't mean it shouldn't be published, but I do
> question the need for it to be Standards Track as a general mechanism.

I believe Standards Track is appropriate, since the objective is
interoperability and the specification defines a set of procedures that
would be implemented by multiple independent software products.

> Mostly, it's because I haven't really seen much discussion of it as a
> general component of the Web / Internet architecture; AFAICT all of the
> interest in it and discussion of it has happened in more specialised /
> vertical places. The issues below are my concerns; they're not
> insurmountable, but I would have expected to see some discussion of them
> to date on lists like this one and/or the TAG list for something that's
> to be an Internet Standard.

You might be right that more discussion has happened off the apps-discuss
list, but I would not equate that with not being a component of the web
architecture.  On the contrary, host-meta has a lot of utility and is an
important building block for the web architecture.  With host-meta, it is
possible to advertise information in a standard way, discover services, and
so on.  Some of those uses are not yet fully defined, but they cannot be
defined without this standard in place.

> * XRD -- XRD is an OASIS spec that's used by OpenID and OAuth. Maybe I'm
> just scarred by WS-*, but it seems very over-engineered for what it
> does. I understand that the communities had reasons for using it to
> leverage an existing user base for their specific use cases, but I
> don't see any reason to generalise such a beast into a generic
> mechanism.

XRD is not complicated.  It's an XML document spec with about seven elements
defined.  In order to convey metadata, one must have some format defined, and
XRD is as good as any other.  I don't think the use of XRD should be
considered a negative aspect.  OpenID uses (through Yadis) a precursor to
XRD called XRDS.  I'm not sure about OAuth's usage of XRD.  Either way, does
this matter?
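
To illustrate how small the format is, here is a minimal host-meta document
using XRD (the host and link values are hypothetical, not taken from the
draft):

  <?xml version="1.0" encoding="UTF-8"?>
  <XRD xmlns="http://docs.oasis-open.org/ns/xri/xrd-1.0">
    <Subject>http://example.com</Subject>
    <Link rel="copyright" href="http://example.com/copyright.html"/>
    <Link rel="lrdd" template="http://example.com/lrdd?uri={uri}"/>
  </XRD>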

> * Precedence -- In my experience one of the most difficult parts of a
> metadata framework like this is specifying the combination of metadata
> from multiple sources in a way that's usable, complete and clear.
> Hostmeta only briefly mentions precedence rules in the introduction.

I assume you are referring to the processing rules in Section 1.1.1?  How
would you propose strengthening that text?
 
> * Scope of hosts -- The document doesn't crisply define what a "host"
> is.

This might be deliberate and not really a fault of this document.  The
"hostname" that we are all used to using for a "host" may or may not refer
to a physical host.  It might refer to a virtual host or a virtually hosted
domain.  In any case, the term is consistent with the one used in the HTTP
spec and its "Host:" header line.
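
For example, in the request below (hypothetical), the "host" is whatever
entity answers for "www.example.com", whether that is a physical machine, a
virtual host, or a hosted domain:

  GET /index.html HTTP/1.1
  Host: www.example.com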

> * Context of metadata -- I've become convinced that the most successful
> uses of .well-known URIs are those that have commonality of use; i.e.,
> it makes sense to define a .well-known URI when most of the data
> returned is applicable to a particular use case or set of use cases.
> This is why robots.txt works well, as do most other currently-deployed
> examples of well-known URIs.
> 
> Defining a bucket for potentially random, unassociated metadata in a
> single URI is, IMO, asking for trouble; if it is successful, it could
> cause administrative issues on the server (as potentially many parties
> will need control of a single file, for different uses -- tricky when
> ordering is important for precedence), and if the file gets big, it will
> cause performance issues for some use cases.

Not all of the use cases are defined, but the host-meta document provides
some examples, such as finding the author of a web page or its copyright
information, and there has been discussion of using it to find a user's
identity provider.  Each of these examples fits well within the host-meta
framework.  It builds upon the "web linking" (RFC 5988) work you did in a
logical and consistent way, and I see these as complementary documents.  To
your concern, host-meta is flexible, but the functionality is bounded.
 
> * Chattiness -- the basic model for resource-specific metadata in
> hostmeta requires at least two requests; one to get the hostmeta
> document, and one to get the resource-specific metadata after
> interpolating the URI of interest into a template.

This is true, but the web is all about establishing links to other
information.  I view this as a "good thing" about host-meta: it provides a
very simple syntax with a way to use well-defined link relation types to
discover other information.
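
As a sketch of the two-request flow in question (host and paths
hypothetical):

  Step 1: fetch the host-wide metadata document

    GET /.well-known/host-meta HTTP/1.1
    Host: example.com

  Step 2: follow a link relation found in that document

    GET /copyright.html HTTP/1.1
    Host: example.com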
 
> For some use cases, this might be appropriate; however, for many others
> (most that I have encountered), it's far too chatty. Many use cases find
> the latency of one extra request unacceptable, much less two. Many use
> cases require fetching metadata for a number of distinct resources; in
> this model, that adds a request per resource.

I think this comes back to using whatever makes the most sense for the job.
Consider robots.txt, for example.  We could define a "robots" link relation
and query the robots.txt file in a two-step operation, or we can just fetch
it from the well-known location.  A well-known location works fine for
robots.txt, but it does not work well for everything; fetching copyright or
author information for a specific resource is one case where it does not.
The point is that host-meta provides a mechanism for discovering information
about a URI.  This is very useful, IMO.
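
To make the per-resource lookup concrete, here is how the {uri} template
interpolation would work (template and target are hypothetical; the target
URI is percent-encoded before substitution):

  Template:  http://example.com/copyright?id={uri}
  Target:    http://example.com/photo/1
  Request:   http://example.com/copyright?id=http%3A%2F%2Fexample.com%2Fphoto%2F1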
 
> I'd expect a general solution in this space to allow describing a "map"
> of a Web site and applying metadata to it in arbitrary ways, so that a
> client could fetch the document once and determine the metadata for any
> given resource by examining it.

But what of URIs that do not refer to a document?  One could learn a lot of
metadata about a specific document by following RFC 5988, but that does not
work for URIs like "mailto:paulej@xxxxxxxxxxxxxx" or for any other
non-'http[s]' URI scheme.  Even for some URIs that use http[s] as the
scheme, there may or may not be an associated document.  Host-meta could
allow one to get metadata about any valid URI for a domain.
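
The same template mechanism sketched above would cover such URIs (address
hypothetical):

  Template:  http://example.com/lrdd?uri={uri}
  Target:    mailto:user@example.com
  Request:   http://example.com/lrdd?uri=mailto%3Auser%40example.com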
 
> If hostmeta is designed for specific use cases and meets them well,
> that's great, but it shouldn't be sold as a general mechanism. So, I'm
> -1 on this going forward as a standards-track general mechanism. I
> wouldn't mind if it were Informational, or if it were Standards-Track
> but with a defined use case.

Perhaps one could have made the same kind of argument about HTTP.  It is a
general mechanism for fetching documents, files, or any data from a
location.  This document is similar, but with a focus on providing metadata
about URIs.  I think the procedures are useful and the document should go
forward as a standards track RFC.

Paul

