[Last-Call] Secdir last call review of draft-ietf-anima-brski-cloud-11

Mike Ounsworth via Datatracker <noreply@xxxxxxxx> · Thu, 14 Nov 2024 13:08:20 -0800

Reviewer: Mike Ounsworth
Review result: Has Nits

I have reviewed this document as part of the security directorate's
ongoing effort to review all IETF documents being processed by the
IESG. These comments were written primarily for the benefit of the
security area directors. Document editors and WG chairs should treat
these comments just like any other last call comments.

The summary of the review is Ready with Nits.

The protocol seems robust, but the document is in need of an editing pass,
particularly around "may" vs "MAY", and presentation of the security-related
concepts could be clearer.

Most of my comments can be ignored if the authors are short on time. I have
flagged the most important ones with “******”.

Security-related comments:
--------------------------

Section 2.2: “Note that the Pledge only sends the CSR Attributes request to the
entity acting as the EST server as per [RFC7030] section 2.6, and MUST NOT send
the CSR Attributes request to the Cloud Registrar.” … why? Is there a security
/ privacy / operational reason for this MUST NOT? Or is it simply a “this won’t
do anything”? If this is a security reason, then please be clear about what
“bad thing” happens.

Last paragraph of Section 3.1.2: there may be other DoS scenarios in which the
Cloud Registrar MAY wish to protect itself; for example, if a large number of
requests come from known-malicious IP addresses, exhibit DoS-style behaviours,
etc. Cloud Registrars MAY implement rate-limiting, incremental backoffs,
predictive filtering, or any other applicable DoS mitigation techniques. BRSKI
clients SHOULD be equipped with retry mechanisms appropriate to the DoS
mitigation techniques used by their manufacturer's Cloud Registrar. Remember that
legitimate devices can become compromised and exhibit malicious behaviour, so
just because a Pledge device successfully completes TLS client-auth does not
mean that it should be fully trusted.
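
To make the retry suggestion concrete, here is a minimal Python sketch of
client-side exponential backoff with full jitter. It is one common approach, not
something the draft specifies; the function name and the base / cap values are
hypothetical.

```python
import random


def backoff_delays(attempts, base=1.0, cap=300.0, rng=random.random):
    """Yield one delay in seconds per retry attempt.

    The window doubles on each attempt up to `cap`; the actual delay is a
    random fraction of that window ("full jitter"), so that many Pledges
    rejected at the same moment do not all retry in lockstep.
    `rng` is injectable so the schedule can be tested deterministically.
    """
    for attempt in range(attempts):
        window = min(cap, base * (2 ** attempt))
        yield rng() * window


if __name__ == "__main__":
    # Hypothetical usage: print the delays a Pledge would sleep between retries.
    for delay in backoff_delays(5):
        print(f"retry after {delay:.2f}s")
```

A Registrar that rate-limits would pair naturally with a client like this, since
the jitter spreads retries out instead of re-creating the original burst.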

****** Section 3.3.1: The security of the jump from Cloud Registrar to Owner /
VAR Registrar relies quite heavily on the BRSKI Provisional TLS mechanism. Even
after skimming RFC 8995, the details of the Provisional TLS stuff did not become
clear to me until I reached somewhere around section 8 of this document. Since
so much of the security of this document relies upon the Pledge correctly
handling the Provisional TLS state, I highly suggest adding a section
specifically about the Provisional TLS state. Section 8.2 has some really good
stuff buried in it about what things a Pledge MAY and MUST NOT do while it is
in the "Provisional TLS" state. Also, I think it would be helpful to explicitly
spell out what validation the Pledge MUST perform in order to get itself out of
the Provisional TLS state. RFC 8995 section 5.6.2 sorta outlines this, but in
my opinion, not well enough to base the entire security of this document on.
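
As an illustration of the kind of explicit treatment I am asking for, here is a
rough Python sketch of the Provisional TLS state as a small state machine. It is
my own simplification, not text from the draft or from RFC 8995: the class and
method names are hypothetical, the rule that only a voucher request is sent
while provisional is an assumption, and the exact-match comparison stands in for
real certificate path validation against the pinned certificate.

```python
import hashlib
from enum import Enum, auto


class TlsState(Enum):
    PROVISIONAL = auto()  # peer certificate not yet verified
    VERIFIED = auto()     # voucher validated and peer matches the pinned cert
    FAILED = auto()       # validation failed; connection must not be used


class PledgeConnection:
    """Hypothetical model of one Pledge-to-Registrar TLS connection."""

    def __init__(self, peer_cert_der: bytes):
        self.peer_cert_der = peer_cert_der
        self.state = TlsState.PROVISIONAL

    def may_send(self, request_kind: str) -> bool:
        # Assumption: while provisional, only the voucher request is allowed.
        if self.state is TlsState.VERIFIED:
            return True
        return self.state is TlsState.PROVISIONAL and request_kind == "voucher-request"

    def complete(self, voucher_signature_ok: bool, pinned_cert_der: bytes) -> TlsState:
        # Leave the provisional state exactly once, in one direction or the other.
        if self.state is not TlsState.PROVISIONAL:
            return self.state
        peer = hashlib.sha256(self.peer_cert_der).digest()
        pinned = hashlib.sha256(pinned_cert_der).digest()
        ok = voucher_signature_ok and peer == pinned
        self.state = TlsState.VERIFIED if ok else TlsState.FAILED
        return self.state
```

Spelling out something like the `may_send` and `complete` rules in prose is what
I would hope a dedicated section would do.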

Section 5 should include some discussion of the management lifecycle of the TLS
certificates used by VAR / Owner Registrars and EST servers. I.e., once a
certificate has been pinned either in Pledge devices or in the vouchers of
upstream Registrars, the operator of such infrastructure needs to coordinate
with their upstream Registrar in order to change their certificates.

Section 8.1, last sentence: “...but do not constitute a security risk, as the
Pledge is correctly verifying all TLS connections as per [BRSKI].” I agree, but
I would strengthen that to “..., so long as the Pledge is correctly verifying
all TLS connections as per [BRSKI]” to highlight that it is tempting for Pledge
manufacturers to be loose with TLS checking around captive portals, but that
doing so will likely introduce exploitable security holes such as where an
attacker simulates a captive portal scenario in order to feed the Pledge a
forged Voucher.

Section 8.2: "There are additional considerations regarding TLS certificate
validation that must be accounted" and "The Pledge should check whether the
identity of the Registrar". Should those be normative MUST and SHOULD? Or is
some other meaning intended?

****** Section 9.1: In addition to trust anchor update, there is another huge
security reason to do firmware updates as the first step after waking up: to
apply any available security patches to the OS, TCP / HTTP stack, etc., to
prevent the device from being exploited by malicious network infrastructure.
I think this ought to be “Pledges SHOULD attempt to contact the manufacturer
and apply any available firmware patches (with any appropriate firmware
signing), and networks SHOULD allow this”. In my own professional work, there
are some classes of devices where it’s preferable for the device to brick
itself in situations like that rather than to allow itself to become
compromised, but advice like that is probably too strong for this document.

****** Section 9.2: I’m not sure that this section draws the through-line to
the security implications clearly enough. The advice in this section
feels more like “It won’t work” type advice rather than “It’ll be insecure”
type advice. If it’s not really security, then maybe move it to another section?

****** Same comment about Section 9.3. What is the security consideration here? What is
a Pledge developer or a Registrar operator supposed to do with the text in this
section?

Non-security comments & Nits
----------------------------

I feel like the use case explanations in the Introduction and Architecture
sections are overly verbose and repetitive. This could be shortened.

****** The draft contains 47 “may” and only 12 “MAY”. I suggest that each of the
47 be checked to decide if it carries the meaning of “is allowed to”, in which
case it should be a normative “MAY”, or carries the meaning of “it could happen
that...”, in which case I suggest that some other wording be found instead of
“may”.
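
For anyone who wants to repeat this check, a minimal Python sketch along these
lines would list the candidate lines. The function name is hypothetical and the
input is assumed to be the plain-text rendering of the draft; note that a
sentence-initial “May” is deliberately not counted.

```python
import re

LOWER_MAY = re.compile(r"\bmay\b")  # case-sensitive: the non-normative spelling
UPPER_MAY = re.compile(r"\bMAY\b")  # the BCP 14 keyword


def audit_may(text):
    """Return (hits, lower_count, upper_count).

    `hits` is a list of (line_number, stripped_line) for every line that
    contains at least one lowercase "may", so each can be reviewed by hand.
    """
    hits = []
    lower = upper = 0
    for number, line in enumerate(text.splitlines(), start=1):
        found = len(LOWER_MAY.findall(line))
        lower += found
        upper += len(UPPER_MAY.findall(line))
        if found:
            hits.append((number, line.strip()))
    return hits, lower, upper


if __name__ == "__main__":
    sample = "A Pledge may be deployed.\nThe Registrar MAY redirect.\n"
    for number, line in audit_may(sample)[0]:
        print(f"{number}: {line}")
```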

"For instance, a SIP phone might have a client certificate to be used with a
SIP proxy." Define or reference "SIP", please.

In the Terminology section: I would add some more words for OEM and VAR to
describe in English how those things are different from each other.

The way it is currently presented, it sounds like 4.1 (redirect to another
Registrar) and 4.2 (redirect to an EST server) are distinct cases, but I assume
that eventually the Pledge needs to end up at an EST server, so 4.2 (redirect
to an EST server) MUST happen eventually, with zero or more iterations of 4.1
(redirect to another Registrar) before it?

Section 2: “there are a number of parties involve” should be “involved”.

Section 3.2: “The Cloud Registrar must determine Pledge ownership” Should that
be a normative MUST? Exactly what technical action is involved in “determining
Pledge ownership”? … ah, this is explained below in 3.2.1. Then I think the
opening sentence should be “The Cloud Registrar must determine Pledge
ownership, see [3.2.1]”.

Section 4.2: typo: “If the est-domain was provided by with an IP address
literal” … “by with” seems like a grammar mistake.

Section 4.2: “The Pledge also has the details it needs to be able to create the
CSR request to send to the RA based on the details provided in the voucher.” …
but that’s not quite true, is it? It may also need to make a /csrattrs call to
the EST server, right?

Section 4.2: “In steps 5.a and 5.b, the Pledge may optionally notify the Cloud
Registrar/MASA of the success or failure of its attempt to establish a secure
TLS channel with the EST server.” It might be helpful to mention the purpose of
doing this. I.e., how does communicating this information benefit the Pledge? What
will the Cloud Registrar/MASA do with this information?

****** Section 4.2: “The Pledge must verify that the issued certificate in step 7
has the expected identifier obtained from the Cloud Registrar/MASA in step 3.”
I feel like this needs to describe some error handling. If it does not contain
the expected identifier, then what is the Pledge supposed to do? Is it supposed
to discard the cert and start over? Is it supposed to trigger revocation of the
mis-issued cert? If so, how?
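
To show where the gap is, here is a small Python sketch of the check itself. The
names are hypothetical, and extracting the identifiers from the certificate is
assumed to have happened already. What the caller should do with a rejected
result (discard, restart, request revocation) is exactly the part the draft
leaves open, so the sketch deliberately stops there.

```python
from dataclasses import dataclass


@dataclass
class EnrollmentCheck:
    accepted: bool
    reason: str


def check_issued_identifier(expected_id, cert_identifiers):
    """Compare the identifier obtained in step 3 with those in the issued cert.

    `cert_identifiers` is assumed to be the collection of identifiers already
    extracted from the certificate returned in step 7.
    """
    if expected_id in cert_identifiers:
        return EnrollmentCheck(True, "identifier matches")
    found = sorted(cert_identifiers)
    # Error handling beyond this point is the open question for the draft.
    return EnrollmentCheck(False, f"expected {expected_id!r}, certificate carries {found!r}")
```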

Section 5: “The well-known URL that is used is specified by the manufacturer
when designing it's firmware” should be “its”.

Section 8.1: “A Pledge may be deployed in a network where a captive portal or
an intelligent home gateway that provides access control on all connections is
also deployed.” I think here you don’t mean “may” as in “is allowed to” (which
is the 2119 meaning of the word) but rather “might find itself”. I suggest that
“A Pledge might find itself deployed…” would be clearer.

Section 9: “This internet accessible service may be operated by the
manufacturer and/or by” and “a Pledge that may have been in a dusty box”. Again, I
think each of these wants to be a “might” rather than a “may”, since we want the
non-2119 meaning of the word here.

Section 9.2: “The Cloud Registrar may have a certificate”
“it is recommended to limit the number”
“the Cloud Registrar may have a certificate that can”
“The Pledge may have any kind of Trust Anchor built in”
These feel like they want to be normative MAY and RECOMMENDED.

There are a couple of instances of markdown section references that did not build
properly, such as “{bootstrapping-with-no-owner-registrar}”.

Section 9.4: “the Cloud Registrar actually does all the voucher processing as
specified in [BRSKI].” Should that be a “MUST” ?
