Hi Jurgen,
Thanks for the review. I sympathize with your confusion. I have often felt the same confusion about other IETF documents that I thought relevant to my work. IETF documents are not encouraged to rephrase parts of other RFCs or to provide extensive operational HOWTO considerations. Actually, in other documents that I co-authored, people were not happy about the large number of examples we provided. In my view the document should state the problem that is being solved and the standard that is proposed to solve it. I tried to do that in this document.
See my comments below;
much useful text has been added in response.
Greetings,
Peter
_____________________________________________________________________
Reviewer: Jürgen Schönwälder
Review result: Serious Issues
Let me start with a disclaimer: I am not familiar with BRSKI and ANIMA and hence I have been reading this I-D as a confused outsider and some of my concerns may not be valid or the result of me not understanding the relevant technologies. That said, my conclusion after reading that document is that it is not ready. At a high level, my concerns are:
- First, it seems to me that there are many options and there is no clear mandatory-to-implement baseline. Hence, I am concerned that this specification will not necessarily lead to interoperable implementations.
Pvds ==>
We could add normative language for one option only. However, we prefer that an installation engineer can choose one option over the other based on the use case. The simplest option is the stateful one, which is common in today's translation devices, but other use cases may not want to implement that and use only the stateless option. I think it is hard for us to choose between these two options.
==>
- Second, it feels like more attention needs to be paid to security concerns. Some of the options may actually be weak from a security point of view and hence narrowing options down may also be desirable to deal with security concerns. I do not think it is sufficient to state that some security issues may be solved by future work.
Pvds ==>
That will be changed. Some text suggestions are made below.
==>
- Third, as an ops-dir reviewer, I am lacking information how this will be operationally deployed, i.e., how a shared link will be properly configured that may have multiple mechanisms to bootstrap routable IP addresses. How do I force pledges to go through this procedure before I hand out or let them discover a routable IP address?
pvds==>
I am confused here; what is a shared link?
Actually, for the link-local discovery, the document relies fully on techniques which are described in other RFCs. The document does not add anything, apart from the character sequences that need to be registered by IANA.
A good point is perhaps that the use of a mesh network should be emphasized.
OLD:
However, the Pledge will not be IP routable until it is authenticated
to the network. A new Pledge can only initially use a link-local
IPv6 address to communicate with a neighbor on the same link
[RFC6775] until it receives the necessary network configuration
parameters. However, before the Pledge can receive these
configuration parameters, it needs to authenticate itself to the
network to which it connects.
NEW:
However, the Pledge will not be IP routable over the mesh network
until it is authenticated to the mesh network. A new Pledge can only
initially use a link-local IPv6 address to communicate with a
mesh neighbor [RFC6775] until it receives the necessary network
configuration parameters. The Pledge receives these configuration
parameters from the Registrar. When the Registrar is not a direct
neighbor of the Pledge but several hops away, the Pledge discovers
a neighboring constrained Join Proxy, which transmits the
DTLS-protected request coming from the Pledge to the Registrar.
The constrained Join Proxy must be enrolled previously such that
the message from the constrained Join Proxy to the Registrar can be
routed over one or more hops.
==>
I also wonder whether alternatives have been considered. Is it really necessary to introduce proxies that rewrite IP addresses? Could it be easier to let Pledges discover special temporary addresses that can be used to reach (without going through a Join Proxy) the Registrar and once a Pledge gets enrolled, it can pick up a more general address? Or is the stateful solution not simply the more robust solution? How many enrollments do we expect a Join Proxy to handle concurrently? What are the bulk enrollment scenarios where a stateless solution would be desirable?
I skimmed through draft-richardson-anima-state-for-joinrouter-03, which has more alternatives. While properties of various solutions are discussed, no clear conclusions are drawn. Back to this document, perhaps I am missing also an applicability statement for the Join Proxy solution.
Pvds==>
The number of simultaneous enrollments will depend heavily on the operational conditions and the chosen physical installation procedure. It may range from one every 15 minutes to a few hundred in half an hour. I doubt that the latter frequency will ever be attained, but I have been amazed by deployments in the past. In short, I don’t know.
This solution was chosen because the original BRSKI document mentions a circuit proxy for HTTPS. This constrained proxy uses DTLS with CoAP and requires few changes to the original BRSKI document. Also, draft-richardson-anima-state-for-joinrouter explored various options, but that does not mean they are deployable. Most overlap with the two options that we have in this draft. I think adding that many options would probably add to the confusion and burden vendors with supporting them all.
==>
* Abstract
I find the abstract difficult to understand for people not familiar with the context of this work. You have to read until the 2nd paragraph to get a clue that this has something to do with BRSKI. I think this should be said right away in the first sentence so that people know that what follows is about BRSKI-specific concepts.
Pvds==>
Good suggestion; will change the paragraph order
==>
And ideally the abstract would be understandable to people not deeply familiar with BRSKI terminology and concepts. After reading
This document extends the work of Bootstrapping Remote Secure Key Infrastructures (BRSKI) by replacing the Circuit-proxy between Pledge and Registrar by a stateless/stateful constrained Join Proxy. It relays join traffic from the Pledge to the Registrar.
I had little clue what this document is about. Perhaps explaining things in simpler terms can help, e.g., something like this:
This document extends the work of Bootstrapping Remote Secure Key Infrastructures (BRSKI) by specifying how a Join Proxy can relay a DTLS session originating from a Pledge with only link-local addresses to a Registrar not directly reachable on the link to which the Pledge is connected.
Pvds==>
My suggestion (I leave Circuit-proxy which is essential IMO):
NEW
This document extends the work of Bootstrapping Remote Secure Key Infrastructures (BRSKI) by replacing the Circuit-proxy between Pledge and Registrar by a stateless/stateful constrained Join Proxy. The constrained Join Proxy is a mesh neighbor of the Pledge and can relay a DTLS session originating from a Pledge with only link-local addresses to a Registrar which is not a mesh neighbor of the Pledge.
==>
The title and the abstract both use the term "constrained Join Proxy" but later almost always the term "Join Proxy" is used. So why is it a "constrained Join Proxy" and not just a "Join Proxy", or is there a difference between a "Join Proxy" and a "constrained Join Proxy"? The captions of Fig. 2 and Fig. 3 state that they show a constrained joining message flow. Can there be others or is this technology for some reason only applicable for some sort of constrained devices?
Pvds==>
Good point.
Either I write "constrained" before every "Join Proxy" or I introduce a phrase stating that both terms describe one and the same concept. It is not yet clear which I will do.
==>
* Join Proxy functionality
I found the text a bit confusing. It talks about why packets to establish a DTLS connection with a Registrar won't be delivered and then afterwards it says that the Pledge is not even able to discover the IP address of the Registrar. Perhaps this text can be simplified and streamlined. It is rather obvious that if a Pledge has only a link-local address, it won't talk with a Registrar multiple IP hops away.
Pvds==>
Now I am confused. I expected you to require more text here.
Something seems to be missing in the description of the baseline scenario, and I need more information to understand what the missing pieces are.
==>
Are both modes required to be implemented? The stateless approach seems to require support by the Registrar while the stateful approach seems to be transparent from the Registrar's perspective. This apparently makes a big difference for the deployment options. To deploy the stateless Join Proxy somewhere in a big network, you need to update the Registrar to support it, right?
Pvds==>
Yes, figure 5 states the discoverable port in the Registrar.
==>
IP_P:p_P   = Link-local IP address and port of the Pledge
IP_R:p_Ra  = Routable IP address and join-port of Registrar
IP_Jl:p_Jl = Link-local IP address and join-port of Join Proxy
IP_Jr:p_Jr = Routable IP address and port of Join Proxy
I was wondering why this is p_Ra, i.e., what the 'a' stands for. Or why is this not:
IP_Pl:p_Pl = Link-local IP address and port of the Pledge
IP_Rr:p_Rr = Routable IP address and join-port of Registrar
IP_Jl:p_Jl = Link-local IP address and join-port of Join Proxy
IP_Jr:p_Jr = Routable IP address and port of Join Proxy
Well, how things are labeled may not be really important.
Pvds==>
This has been adapted as suggested by Rob Wilton in the AD review
==>
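To make the address notation above concrete, here is a minimal sketch of the state a stateful Join Proxy might keep. It is an illustration only, not text from the draft: the class name, the method names, and the starting value for the p_Jr port range are assumptions made for this example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

Addr = Tuple[str, int]  # (IP address, UDP port)


@dataclass
class StatefulJoinProxy:
    """Toy model of the per-Pledge state kept by a stateful Join Proxy."""
    registrar: Addr                 # IP_R:p_Ra
    next_port: int = 40000          # hypothetical start of the p_Jr range
    by_pledge: Dict[Addr, int] = field(default_factory=dict)
    by_port: Dict[int, Addr] = field(default_factory=dict)

    def from_pledge(self, pledge: Addr) -> Tuple[int, Addr]:
        """Map a packet from IP_P:p_P to (local port p_Jr, Registrar)."""
        if pledge not in self.by_pledge:
            port = self.next_port
            self.next_port += 1
            self.by_pledge[pledge] = port
            self.by_port[port] = pledge
        return self.by_pledge[pledge], self.registrar

    def from_registrar(self, local_port: int) -> Optional[Addr]:
        """Map a reply arriving on p_Jr back to IP_P:p_P, if known."""
        return self.by_port.get(local_port)
```

In this sketch the Registrar only ever sees IP_Jr:p_Jr, which is why the stateful mode is transparent to it; the cost is one table entry per Pledge that is currently enrolling.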
I wondered: How does this all interact with SLAAC and/or DHCP on a shared link? You seem to assume that SLAAC and/or DHCP are disabled as long as a Pledge is not yet enrolled, right? In some networks, you will also have 802.1X for enabling layer 2 ports. How do all these things fit operationally together? What are operationally meaningful setups? In a shared network scenario, how do I effectively prevent a Pledge from using router advertisements to generate a routable address? Or is in such a deployment a Join Proxy simply not necessary? Perhaps these questions go beyond this document and they just show my lack of background.
Pvds==>
Only DTLS connections are allowed on the BRSKI mesh network. Certificates signed by the Registrar are used to set up the DTLS connections. Unprotected messages may be routed but will never be accepted by the recipient.
==>
Are there any message size issues since the stateless solution encapsulates the DTLS payload in another header? I see that this is mentioned in the table at the end as a property of the stateless mode, but there is no discussion of any consequences this may have.
Pvds==>
No discussion is given because we do not know all operational conditions.
Installation engineers are given the choice.
==>
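A rough feel for the size question can be had from a small calculation. The sketch below assumes, purely for illustration, that the stateless Join Proxy wraps the DTLS payload in a two-element CBOR array of byte strings, [header, payload], and that the header carries a 16-byte IPv6 address plus a 2-byte port. The exact layout in the draft may differ; the function names are hypothetical. The CBOR head encoding follows RFC 8949.

```python
import struct


def cbor_head(major: int, value: int) -> bytes:
    """Encode a CBOR initial byte plus its argument (RFC 8949)."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 2**8:
        return bytes([(major << 5) | 24, value])
    if value < 2**16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 2**32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", value)


def cbor_bstr(data: bytes) -> bytes:
    """Encode a definite-length CBOR byte string (major type 2)."""
    return cbor_head(2, len(data)) + data


def wrap(header: bytes, dtls_payload: bytes) -> bytes:
    """Assumed layout: CBOR array of two byte strings [header, payload]."""
    return cbor_head(4, 2) + cbor_bstr(header) + cbor_bstr(dtls_payload)


def overhead(header_len: int, payload_len: int) -> int:
    """Bytes added on top of the bare DTLS payload."""
    return len(wrap(bytes(header_len), bytes(payload_len))) - payload_len
```

Under these assumptions, an 18-byte header adds 22 bytes to a 100-byte DTLS record and 23 bytes to a 1000-byte record. Whether that matters depends on the link MTU and on fragmentation behavior in the mesh, which is the operational condition left to the installation engineer.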
There are three different discovery options. Are all three mandatory to implement? Is having many options to start with desirable from an interoperability point of view?
Pvds==>
Rob Wilton also commented on this aspect; that has been changed in the latest version.
==>
I tried to figure out how in 6.1.1 the Registrar is found. I followed several references, discovered several options, ended up in GRASP as one of them. Once I have the registrar's address, I can query the Registrar for more details. Then we have 6.1.2 which details how GRASP can be used directly to provide all relevant information. This section says it is "normative for uses with ANIMA ACP". Not sure what that means: did the authors mean that it is mandatory to implement for ANIMA ACP or that it is mandatory to use for ANIMA ACP? Normative feels like the wrong word, or is the other text not normative or what is conditionally normative in which contexts? As a newcomer, I only found section 6.3.1 reasonably clear (there is a link-local coap multicast, I can see how that works).
Pvds==>
Not sure about “normative for use” or “normative to implement”; does “normative for use” imply “normative to implement”?
==>
* Security Considerations
There may be more security-relevant questions. How robust is this design against attacks? Can this be exploited for attacks? How does a join proxy decide which (DTLS) traffic should be forwarded and which should not be forwarded, or is the idea that any traffic is forwarded? Is the Join Proxy required to verify that the forwarded traffic is actually (valid) DTLS traffic?
pvds==>
Good point. In my understanding only DTLS connections are accepted by the destination. Refusing to route non-DTLS traffic may be a bit prohibitive. The suggestion is to add the following text after the first paragraph.
NEW
A malicious constrained Join Proxy has a number of routing possibilities:
- It sends the message on to a malicious Registrar. This is the same case as the presence of a malicious Registrar discussed in RFC 8995.
- It does not send on the request or does not return the response from the Registrar. This is the case of a non-responding or crashing Registrar discussed in RFC 8995.
- It uses the returned response of the Registrar to enroll itself in the network. With very low probability it can decrypt the response. Successful enrollment is deemed too unlikely.
- It uses the request from the pledge to appropriate the pledge certificate, but then it still needs to acquire the private key of the pledge. This is also assumed to be highly unlikely.
A malicious node can construct an invalid Join Proxy message. Suppose the destination port is the coaps port. In that case, a Join Proxy can accept the message and add the routing addresses without checking the payload. The Join Proxy then routes it to the Registrar. In all cases, the Registrar must receive the message at the join-port, check that the message consists of two parts, and use the DTLS payload to start the BRSKI procedure. It is highly unlikely that this malicious payload will lead to node acceptance.
A malicious node can sniff the messages routed by the constrained Join Proxy. It is very unlikely that the malicious node can decrypt the DTLS payload. A malicious node can read the header field of the message sent by the stateless Join Proxy. This ability does not yield much more information than the visible addresses transported in the network packets.
==>
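The Registrar-side check that "the message consists of two parts" could look roughly like the sketch below. It assumes, as an illustration only, that the two parts are a CBOR array of exactly two byte strings (header, DTLS payload); the function names are hypothetical and the real message format is whatever the draft defines.

```python
from typing import Optional, Tuple


def read_head(buf: bytes, pos: int) -> Tuple[int, int, int]:
    """Return (major type, argument, next position) for a CBOR head."""
    if pos >= len(buf):
        raise ValueError("truncated")
    initial = buf[pos]
    major, info = initial >> 5, initial & 0x1F
    pos += 1
    if info < 24:
        return major, info, pos
    if info > 27:
        raise ValueError("unsupported additional info")
    size = 1 << (info - 24)  # 1, 2, 4 or 8 argument bytes
    if pos + size > len(buf):
        raise ValueError("truncated")
    return major, int.from_bytes(buf[pos:pos + size], "big"), pos + size


def read_bstr(buf: bytes, pos: int) -> Tuple[bytes, int]:
    """Read one definite-length byte string starting at pos."""
    major, length, pos = read_head(buf, pos)
    if major != 2 or pos + length > len(buf):
        raise ValueError("expected a complete byte string")
    return buf[pos:pos + length], pos + length


def parse_two_part(msg: bytes) -> Optional[Tuple[bytes, bytes]]:
    """Return (header, payload), or None if the message is malformed."""
    try:
        major, count, pos = read_head(msg, 0)
        if major != 4 or count != 2:
            return None
        header, pos = read_bstr(msg, pos)
        payload, pos = read_bstr(msg, pos)
    except ValueError:
        return None
    # Reject trailing bytes after the two parts.
    return (header, payload) if pos == len(msg) else None
```

A check like this only rejects messages with the wrong outer shape. It does not inspect the payload, so acceptance of a node still rests on the DTLS handshake and the BRSKI procedure that follow.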
The stateless proxy seems to allow outside attackers to send arbitrary packets to any link-local address inside.
Pvds==>
Like any node that can send link-local broadcast and unicast; I don’t think this is specific to the constrained Join Proxy.
==>
This looks like a new reflection service that must be kept operationally under control, in particular since enrolled Pledges may later act as Join Proxies as well. The security considerations text indicates that future work may address this issue by encrypting the CBOR array. Is this sufficient? Do we really want to standardize a new reflection service that we then fix in the future? I am also not sure why level 2 protection (what is 'level 2'? layer 2? link-layer protection?) will actually resolve the problem: once I can route IP packets to a Join Proxy, I can let it forward traffic to arbitrary link-local addresses, no?
Pvds==>
No; only DTLS packets can be sent to Registrars. The Registrar decides, in combination with the manufacturer’s MASA, whether a node can be accepted into the network.
Level 2 => layer 2
Some new text is proposed.
OLD
If such
scenario needs to be avoided, then it is reasonable for the Join
Proxy to encrypt the CBOR array using a locally generated symmetric
key. The Registrar would not be able to examine the result, but it
does not need to do so. This is a topic for future work
NEW
If such
scenario needs to be avoided, the constrained Join
Proxy MAY encrypt the CBOR array using a locally generated symmetric
key. The Registrar is not able to examine the encrypted result, but
does not need to. The Registrar stores the encrypted header in the return packet without modifications. The constrained Join Proxy can decrypt the contents to route the message to the right destination.
==>
Is there anything that prevents an attacker from creating a packet with a stack of JPY_messages, effectively source routing messages through a chain of Join Proxies? How will I debug such things if they happen?
Pvds==>
Interesting. I hope you find an acceptable answer in the added security text. I don’t think debugging is necessary, although detecting malicious nodes is always a challenging occupation.
==>