Apologies for being a little laggy -
Some comments inline.
On 8/31/17 2:07 PM, Eliot Lear wrote:
Robert,
As I wrote earlier, this was a great review. Thanks for that.
Please see below.
On 8/30/17 7:21 PM, Robert Sparks wrote:
Reviewer: Robert Sparks
Review result: Almost Ready
This is an exciting concept, and the draft overall is approachable. I
have identified a few areas I think need more detail, and have a
longish list of nits (please don't take that to be negative).
==Issues==
I find the structure of the introduction unclear. Please consider
reworking it. I would suggest even more succinctly listing goals and
constraints, and then intended applicability (these things are in the
current text, but I think you can render them much more efficiently). In
particular, the argument that implementers of things are incented only to
provide the minimal amount of behavior to get their thingyness could be
more strongly highlighted.
I've received conflicting reviews here. Most like what's there
and while I'm open to specific textual changes, a full
reorganization would likely be destabilizing.
I disagree. I know it's work, but if you read this as an
implementer, I think it would pay back.
This is a stylistic suggestion though, so take it only as my
opinion.
The document proposes "reputation services". It needs more words about
whether those exist, and what scopes the architecture imagines (an
enterprise might have a different idea of a reputation service than a
residence). There is a notion of "decent web reputations" in the security
considerations section. Who determines that? The security considerations
section should talk about attacks against the reputation services.
This is discussed in security considerations:
It may also be useful to limit retrieval of MUD URLs to only those
sites that are known to have decent web reputations.
What I am specifically talking about are web or domain reputation
services. These are pretty commonplace today. Your browser uses
one, and numerous companies offer them, including our own. But to
be clearer, I propose to use the term "web or domain reputation
services", so that people know what I'm talking about.
I'll look at the diff.
In the first paragraph of Section 2, it's not clear if you are trying
to restrict the models to only those in the two documents in the list
following the paragraph.
Right. This was caught earlier, and I'll correct.
I am not a YANG doctor, so this may be in the weeds, but it feels like
there's a discrepancy between the diagram at the end of section 2 and the
element definitions in section 3. In particular 3.7 doesn't seem to align
with what the diagram or the example in Appendix B uses. Should you be
defining "from-device-policy" and "to-device-policy" instead of
"packet-direction"? (I'm wondering if 3.7 reflects an older design?)
Yupper. Fixed (I think).
At section 3.13, the description of my-controller is not quite right.
This bit signals to the mud controller to use a mapping that it knows
about or creates. Something else established that class (and maybe gave
it a name). I talked about this with Eliot and he has a better
description to use.
Proposed text:
This null-valued node signals to the MUD controller to use whatever
mapping it has for this URL to a controller class. This may require
prompting the administrator for class members. Future work should
seek to automate membership management.
ack
It's not clear to me that this is a good use of .well-known. I suggest
getting an expert review on the proposed usage. (I had a quick
conversation with Mark Nottingham and got some initial feedback that
I'm passing along here. I'm sure there's more that an in-depth review
would identify.) Why wouldn't a URI template (RFC6570) do the job?
Rather than use RFC3986's query, consider pointing to HTML5 (which
would bring the more familiar key=value format).
The key issue is that we want to externalize versioning AND
hardcode it in the URL so that it's independent of transport.
Remember, there is very little information exchange between the
Thing and the network, and I will claim that's a good thing.
Sure, and URL templates would _do_ that, no? But please take that
argument up with folks like Mark.
The document needs to say more about how HTTP is used. I assume you only
intend to use GET, and that you expect redirects to be followed, and that
nothing special needs to be considered with caching? The document needs
to be explicit about it. Take a look at
<https://mnot.github.io/I-D/bcp56bis/>. (There's been some conversation
about it on the art list, so Eliot, at least, is already aware of it -
see <https://mailarchive.ietf.org/arch/search/?q=bcp56bis>)
What we say today is the following:
Processing of this
URL occurs as specified in {{RFC2818}} and {{RFC3986}}.
There is one aspect of caching semantics we should probably
capture, which is that the cache-validity period should exceed the
HTTP cache or expiry period as specified by max-age or Expires.
Does that sound about right to you?
Goes in the right direction. Do you expect POST to work with this?
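One possible reading of the caching rule discussed above, sketched in Python. The function name, the assumption that cache-validity is expressed in hours, and the choice to take the larger of the two periods are illustrative assumptions, not text from the draft.

```python
from typing import Optional


def refresh_after_seconds(cache_validity_hours: int,
                          http_max_age: Optional[int]) -> int:
    """Return how long a controller might wait before re-fetching a MUD file.

    cache_validity_hours: value taken from the MUD file (assumed to be hours).
    http_max_age: freshness lifetime from the HTTP response in seconds,
                  or None if the server sent no max-age/Expires.
    """
    validity = cache_validity_hours * 3600
    if http_max_age is None:
        return validity
    # Assumption: never re-fetch sooner than the HTTP layer considers
    # the response fresh, so use the longer of the two periods.
    return max(validity, http_max_age)
```

With these assumptions, a MUD file carrying a 48-hour cache-validity and an HTTP max-age of one hour would be re-fetched after 48 hours, not after one.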
I think there needs to be more discussion of the PKI used for signing MUD
files.
We do have some discussion in Section 12.2. I'm happy to add an
additional sentence or two, but would seek guidance on where you
think we're missing.
So, are you expecting to reuse the web PKI here? Will the MUD files
be signed with the same credentials used by the HTTP server? I'm
thinking you aren't, and are waving your hands at where trust lies
with the recommendation that signers be validated directly etc.
Either way, I think you need to be more explicit and that what you
expect for establishing trust is going to take more than a couple of
sentences.
Consider discussing whether the stacks used by typical things will let
them add DHCP options (or include bits in the other protocols being
enabled). If it's well known (I can't say) that these stacks typically
_won't_ provide that functionality, then you should punch up the
discussion of the controllers mapping other identifiers to MUD URLs on
behalf of the thing.
I agree. We allude to this in the draft. We say, for instance:
It is possible that there may be other means for a MUD URL to be
learned by a network. For instance, if a device has a serial number,
it may be possible for the MUD controller to perform a lookup of the
device, if it has some knowledge as to who the device manufacturer
is, and what its MUD file server is. Such mechanisms are not
described in this memo, but are possible.
The case we have in mind is LoRaWAN. Should we go further?
I think explicitly acknowledging that some things' stacks limit their
behavior will pay back. It would be unfortunate if someone who
started a MUD controller implementation made the assumption that the
majority of things will hand them the DHCP option (etc.) and waited
to bolt the complexity of the lookup above onto their initial
design.
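To make the design point concrete, here is a minimal Python sketch of a controller that treats the lookup path as a first-class case from the start. The function name, the inventory table, the serial number, and the example URL are all hypothetical; the draft does not define this lookup mechanism.

```python
from typing import Callable, Optional


def resolve_mud_url(announced_url: Optional[str],
                    device_id: str,
                    lookup: Callable[[str], Optional[str]]) -> Optional[str]:
    """Pick a MUD URL for a device.

    announced_url: URL the thing emitted itself (e.g. via the DHCP option),
                   or None if its stack could not send one.
    device_id:     some other identifier the network learned (hypothetical,
                   e.g. a serial number).
    lookup:        maps such an identifier to a MUD URL, or None if unknown.
    """
    if announced_url:
        return announced_url
    # Fallback: the thing said nothing, so ask the lookup service.
    return lookup(device_id)


# Hypothetical inventory mapping serial numbers to MUD URLs.
inventory = {"SN-0001": "https://mud.example.com/lightbulb"}
```

For example, `resolve_mud_url(None, "SN-0001", inventory.get)` returns the inventory entry, while a device that announces its own URL bypasses the lookup entirely.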
You suggest the DHCP Client (which is a thing) SHOULD log or report
improper acknowledgments from servers. That's asking a bit much from
a thing. I suspect the requirement is unrealistic and should be removed
or rewritten to acknowledge that things typically won't do that.
I think there's a philosophical thing hiding here, though: what
expectations should we have of a device? As a SHOULD we're saying,
if you have good cause not to, ok. But otherwise, for the sake of
the sanity of the customer, please log.
Why not acknowledge in the document that the expectation is that
most won't be able to.
Painting as explicit and accurate a picture as possible can only do
good, no?
(Again, I'm not trying to _assert_ that the majority of things
can't, but that's my suspicion. People who work with these things on
a daily basis should weigh in.)
The security and deployment considerations sections talk about the
need for coordination if control over the domain name used in the URL
changes. It should talk more about what happens if the new administration
of the domain is not interested in facilitating a transition (consider
the case of a young company with a few thousand start-up-ish things out
there that loses a suit over its name). Please discuss whether or not
suddenly losing the MUD assisted network configuration is expected to
leave the devices effectively cut-off.
It should not, and here's why:
- Assuming the device has already been used, there is no
reason to simply delete the MUD file from one's cache. The
cache-validity value is meant as a timer to keep
implementations from harassing the MUD file server, but
the information is still useful, even if it may not have been
freshened.
Hrmm - there should be some description about using the information
even if the cache has expired. That might have security
ramifications (it at least enables an attacker to cause a set of
devices to use old information by attacking the access to what might
be newer information).
- In the case where the MUD file service is unavailable when
the device is first turned up, it's as if it had not included
a MUD-URL in the first place. While this may be a downgrade
attack, there is, as I understand it, really no way to get
around it, other than for the MUD controller to log a problem.
Worth a short discussion in the text.
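The two cases above (stale cache, and no MUD file at first turn-up) could be sketched as follows in Python. The function and logger names are hypothetical, and reusing an expired cache entry is the behavior proposed in this thread, not something the draft currently specifies.

```python
import logging
from typing import Optional

log = logging.getLogger("mud-controller")


def choose_mud_file(fresh: Optional[dict],
                    cached: Optional[dict]) -> Optional[dict]:
    """Decide which MUD file (if any) to apply for a device.

    fresh:  result of a fetch attempt just made, or None if the server
            could not be reached.
    cached: a previously retrieved copy, possibly past its cache-validity.
    """
    if fresh is not None:
        return fresh
    if cached is not None:
        # Stale data is still useful, but log it: an attacker who blocks
        # access to the server can pin devices to old information.
        log.warning("MUD file server unreachable; reusing stale cached copy")
        return cached
    # First turn-up with no server: behave as if no MUD URL was offered.
    log.warning("MUD file server unreachable and nothing cached")
    return None
```

Logging in both fallback branches gives the administrator a trail for the downgrade and stale-information concerns raised above.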
Right now, you leave the DHCP server (when it's used) responsible for
clearing state in the MUD controller. Please discuss what happens when
those are distinct elements (as you have in the end of section 9.2) and
the DHCP server reboots. Perhaps it would make sense for the DHCP server
to hand the length of the lease it has granted to the MUD controller and
let the MUD controller clean up on its own?
See other response.
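As an illustration of the suggestion that the MUD controller clean up on its own, here is a minimal Python sketch in which the DHCP server passes along the lease length. The class and method names are hypothetical and no such interface is defined in the draft.

```python
from typing import Dict, List, Tuple


class MudState:
    """Per-client MUD state that expires with the DHCP lease."""

    def __init__(self) -> None:
        # client_id -> (mud_url, absolute expiry time in seconds)
        self._entries: Dict[str, Tuple[str, float]] = {}

    def record_lease(self, client_id: str, mud_url: str,
                     lease_seconds: float, now: float) -> None:
        """Store or refresh state when the DHCP server reports a lease."""
        self._entries[client_id] = (mud_url, now + lease_seconds)

    def purge_expired(self, now: float) -> List[str]:
        """Drop entries whose lease has run out; return the removed ids."""
        expired = [cid for cid, (_, expiry) in self._entries.items()
                   if expiry <= now]
        for cid in expired:
            del self._entries[cid]
        return expired

    def active(self) -> List[str]:
        return sorted(self._entries)
```

Because expiry is tracked in the controller, a DHCP server reboot that loses its own lease table does not leave orphaned state behind; entries simply age out unless a renewal is reported.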
The document currently suggests that a piece of software inspect the
WHOIS database to see if registration ownership of a domain has changed.
Do you really mean software, or should this be advice to the
administrator of the controller instead?
The controller. The idea is to catch bad behavior and anomalies.
And the bigger idea is to reduce the number of decisions that the
administrator must make, while providing relevant information with
which to make the decisions.
I don't think this is a reasonable thing to do. It has many of
the same properties we complain about when someone suggests that code
inspect an IANA registry.
==Nits==
I recommend an editorial pass focusing on simplifying sentences. Look
particularly where the word "therefore" is used and consider
restructuring the surrounds. (It is used as a non sequitur in a couple of
places). Be careful to call out actors explicitly (I note the places
that particularly caught my eye below).
ok.
Some specific nits:
The abstract speaks only about properties of MUD but does not describe
what MUD _is_, or is good for. A few more words here would help.
Right. See response to Henk.
Next to last paragraph of section 1 (before 1.1): A means for _who_ to
retrieve the description? (Consider rendering the three list elements on
their own lines.)
Right. Fixed.
The last sentence of section 1 treats "enterprise networks" more
specially than it intends, I think. Why couldn't _any_ network do this?
Could the sentence be reworded to make it clear that enterprise networks
are an example?
s/enterprise networks/local deployments/
?
Sure
First sentence of 1.1: Perhaps you mean "general purpose computing
devices" instead of "general computing"? "their" has an unclear
antecedent.
Indeed. fixed.
Last paragraph of 1.3: It's unclear what "such an approach" is intended
to point to. Would "a general solution that required capabilities their
particular device would not use" make more sense?
Reworded.
First paragraph of 1.5: "might to allow" is probably meant to be "might
be to allow". What does it mean for a controller to "need to speak COAP"?
Do you mean "controllers capable of speaking COAP"?
Fixed.
Fourth paragraph of 1.5 at the discussion of time and effort: Consider
rephrasing this to focus on the result of the time and effort (high
quality) rather than the time and effort itself.
Existence is good enough in this case ;-)
In the list of abstractions at the end of 1.5, you have three things you
describe as devices and one thing you describe as a class. You later talk
about the abstractions you've described as devices as classes. At this
point in the document what you mean by "class" has not been made as
explicit as it could be.
I've tried to review all instances of class to be clear that it is
used consistently. This is somewhat difficult given natural
language, but I hope I've gotten it right.
Section 1.8, item 3: the MUD file doesn't have hosts in it (it has
identifiers of some kind). Consider being more explicit about what
you mean by testing that against a reputation service.
Actually, it can have hosts in it, but see above.
Section 3.1: You say "Which turn was taken". I think you meant
"Which, in turn, was taken". Consider deleting "for those keeping score".
Awwww. Just a bit of humor? ;-)
Section 3.3 is missing a word at "the location any MASA service"?
Ok it's cleaned up.
I found the prose in the descriptions of the "manufacturer" and
"same-manufacturer" elements (3.8 and 3.9) very confusing. I think
additional prose introducing the concepts and maybe some examples would
be very useful.
Added examples per your suggestion.
What do you mean by "matches" at 3.10? Do you mean "is"?
All of this is applicable in the context of the matches statement
in the ACL model. I've added some explanatory text at the
beginning of the chapeau.
The caution in the 2nd paragraph of 3.12 is not clear.
Ok, I've cleaned that up and added an example.
At section 4, consider pointing out that you are not allowing
DHCP by default, and that devices that are expected to use DHCP
need to have an explicit allow in their MUD file.
Hmm. The issue here is that DHCP is an L2 protocol that isn't
forwarded. Do you think it needs to be listed anyway?
Hmm indeed. You're right. That said, the thing that triggered the
thought was the ability of a MUD file to say whether or not something
can talk to other things in the local network. Maybe some
reinforcement in the discussion about what that rule would expand to
would prevent someone from walling off the device more than you
intended (it would be a creative mistake to do so, I agree).
The description of the manufacturer leaf in the MUD YANG model
could be made more useful.
ALL the descriptions have been improved.
Provide a reference for "giaddr" when you use it in section 9.2.
Cleaned up (that's defined in RFC 2131, already normatively
referenced, but I expanded).
Section 14, 2nd paragraph: additional segmentation of what?
Make that "network segmentation".
Second paragraph of Section 15 - it would help to be more precise
with agency. _Who_ should review the class?
Fixed.
In the security considerations section, when you get to the "if for some
reason it is not possible to determine whether ownership has changed",
_who_ are you suggesting conduct further review?
It's always the network administrator.
==Micro-nits==
1,$s/enorcement/enforcement/g
Doh!
s/autjors/authors/
Fixed.
Thanks again,
Eliot