Re: [decade] FW: Last Call: <draft-farrell-decade-ni-07.txt> (Naming Things with Hashes) to Proposed Standard

Jonathan A Rees <rees@xxxxxxxxxx> · Wed, 6 Jun 2012 16:33:36 -0400

As requested I am sending comments on this last call draft to
ietf@xxxxxxxx. I sent them to the authors on 6 May but received no
reply.

Jonathan Rees

---------- Forwarded message ----------
From: Jonathan A Rees <rees@xxxxxxxxxx>
Date: Sun, May 6, 2012 at 7:57 PM
Subject: comments on http://tools.ietf.org/html/draft-farrell-decade-ni-06
To: Alexey Melnikov <alexey.melnikov@xxxxxxxxx>, Barry Leiba
<barryleiba@xxxxxxxxxxxx>, "S. Farrell" <stephen.farrell@xxxxxxxxx>,
"P. Hallam-Baker" <pbaker@xxxxxxxxxxxx>

Here are some opinions on
http://tools.ietf.org/html/draft-farrell-decade-ni-06 :

I think this URI scheme would be a welcome addition to web
architecture. Wide review should be sought, because this might become
quite important and if there are problems they will be very difficult
to fix later.

I think using .well-known is a good idea.

I think integration into the ecosystem, such as browser support,
should be anticipated; for this reason I think content type should be
elevated from an 'optional feature' to a 'required feature'.

[i.e. conformant implementations must support it, even if providing
the content type in the URI is itself optional.]

If you don't do this, you are just encouraging sniffing and privilege
escalation attacks. Sniffing would be a big step backwards. Better to
do what the data: scheme does and say that there is a default content
type of, say, text/plain, and that otherwise the content type ought to
be specified in the URI.
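
To make the data:-style default concrete, here is a minimal sketch of
how a client might settle on a content type without sniffing. It
assumes the ct= query parameter from the optional-features spec and a
text/plain default; the function name and the default are illustrative
assumptions, not anything the draft specifies.

```python
from urllib.parse import urlsplit, unquote

# Assumed default, mirroring what the data: scheme does. Not from the draft.
DEFAULT_CONTENT_TYPE = "text/plain"


def effective_content_type(ni_uri: str) -> str:
    """Return the content type named by ct= in the URI, else the default.

    The query is split by hand rather than with parse_qs, because
    parse_qs turns '+' into a space and would mangle types such as
    image/svg+xml.
    """
    for pair in urlsplit(ni_uri).query.split("&"):
        name, _, value = pair.partition("=")
        if name == "ct" and value:
            return unquote(value)
    return DEFAULT_CONTENT_TYPE


if __name__ == "__main__":
    # The hash value here is a placeholder, not a real digest.
    print(effective_content_type("ni:///sha-256;abc?ct=image/svg+xml"))
    print(effective_content_type("ni:///sha-256;abc"))
```

The point is that the client never inspects the bytes to guess a type:
it either has one from the URI or falls back to a fixed, harmless
default.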

Content-type privilege escalation risk (and incorrect sniffing risk)
should be mentioned in the security considerations section in any
case.

Maybe the risk that the host used for retrieval might be spoofing the
content-type (by providing a bogus content-type in an HTTP response)
should also be mentioned. (A possible design would be to put the
content-type, and maybe other headers like Expires:, in the hashed
content, to be pulled out into the HTTP response when the content is
served by an HTTP server and then checked by the client, but I
understand that this would be a tooling headache.)
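
Short of putting headers into the hashed content, a client could at
least refuse a response whose declared type disagrees with the one in
the URI. A rough sketch of that weaker check follows; it is my
illustration, not the design described in the parenthesis above, and
the helper names are hypothetical.

```python
def media_type(header_value: str) -> str:
    """Reduce a Content-Type value to its bare type/subtype, lowercased.

    Parameters such as charset are dropped, since only the media type
    itself is being compared here.
    """
    return header_value.split(";", 1)[0].strip().lower()


def content_type_consistent(uri_ct: str, response_header: str) -> bool:
    """True if the server-declared type agrees with the URI's ct= value."""
    return media_type(uri_ct) == media_type(response_header)


if __name__ == "__main__":
    print(content_type_consistent("image/jpeg", "image/jpeg"))
    print(content_type_consistent("text/plain", "text/html; charset=utf-8"))
```

This only catches a mismatch when the URI carries a content type, which
is one more reason to treat that feature as required.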

(I don't understand why you want to split the 'optional' features out
into a separate spec. This made me miss the ct= feature entirely at
first.)

I think the documentation should say that the hash and content type
together identify the resource, and that because the content can be
verified, the resource can be sought (using the .well-known path, or
any other path for that matter) from any source that the client thinks
might have it. The primary and alternate domain name(s), and 'wrapped'
URLs, are only provided as hints.
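
As a sketch of why the source does not matter: the client can build a
.well-known URL for any host it likes and then check whatever comes
back against the hash. The code below assumes the
ni://authority/alg;value layout, unpadded base64url hash values, and a
/.well-known/ni/alg/value path; treat those details as my reading of
the draft rather than a quotation of it. Only sha-256 is handled, and
the function names are hypothetical.

```python
import base64
import hashlib
from urllib.parse import urlsplit


def digest_b64url(content: bytes) -> str:
    """SHA-256 of the content, base64url-encoded without '=' padding."""
    digest = hashlib.sha256(content).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def split_ni(ni_uri: str):
    """Return (authority_hint, algorithm, hash_value) from an ni URI."""
    parts = urlsplit(ni_uri)
    if parts.scheme != "ni":
        raise ValueError("not an ni URI")
    alg, sep, value = parts.path.lstrip("/").partition(";")
    if not sep or not value:
        raise ValueError("missing algorithm or hash value")
    return parts.netloc, alg, value


def well_known_url(ni_uri: str, host: str) -> str:
    """Build a retrieval URL on any host; the URI's own authority is ignored."""
    _, alg, value = split_ni(ni_uri)
    return f"http://{host}/.well-known/ni/{alg}/{value}"


def matches(ni_uri: str, content: bytes) -> bool:
    """True if the retrieved bytes hash to the value named in the URI."""
    _, alg, value = split_ni(ni_uri)
    if alg != "sha-256":
        raise ValueError("only sha-256 is handled in this sketch")
    return digest_b64url(content) == value


if __name__ == "__main__":
    body = b"Hello World!"
    uri = "ni://example.com/sha-256;" + digest_b64url(body)
    # mirror.example.org is a made-up host standing in for "any source".
    print(well_known_url(uri, "mirror.example.org"))
    print(matches(uri, body), matches(uri, b"something else"))
```

Note that well_known_url takes the host as a separate argument: the
authority in the URI is never consulted, which is exactly what treating
it as a hint means.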

I agree with other commenters on the peculiarity of using // to
provide the location hint since the named host is not being trusted as
an authority. I don't understand why the 'primary' location isn't just
encoded in the query, just like the alternate domain(s) and "wrapped
URL(s)". This would have the nice property that you can put the
identifying parts (i.e. hash and content type) first, and the less
important location hints all together after the identification. The
various location hints (whether primary or secondary) would sit side
by side and their similarity would be clearer.

(Unless I'm misunderstanding something and the part after the //
actually has some status other than that of a hint? That would seem
to defeat the purpose.)

Jonathan