[Last-Call] Re: Secdir last call review of draft-ietf-ace-revoked-token-notification-06

Marco Tiloca <marco.tiloca=40ri.se@xxxxxxxxxxxxxx> · Fri, 17 May 2024 19:06:36 +0200

    Hello Kyle,

      Thanks for your additional comments! Please find our replies
      inline below.

      Best,

      /Marco

    On 2024-05-13 22:47, Kyle Rose wrote:

          On Fri, May 10, 2024 at 9:12 AM Marco Tiloca <marco.tiloca@xxxxx> wrote:

                * The security issue outlined in section 13.5 ("Dishonest clients") adequately
justifies maintaining confidentiality of the full list of revoked hashes, or at
least of recent additions to the list, without reference to privacy as
mentioned in section 13.1. The privacy issues posed by hashes of tokens that
are not widely distributed or visible to passive observers are not at all clear
to me.

              ==>MT

              Quoting the privacy-related sentence in Section 13.1:

              > Disclosing any information about revoked access
              tokens to entities other than the intended registered
              devices may result in privacy concerns.

              Admittedly, it is hard to think of immediate, severe
              privacy consequences from revealing token hashes and their
              insertion/deletion in the update collection of
              non-pertaining parties.

              We included that sentence with the intentionally open "may
              result", as a way to err on the side of caution.

            I guess I
              would say that unless you can articulate a specific
              privacy implication, stating that there "may result" some
              privacy concern is more CYA than actually informational to
              readers or implementors. When this is done broadly, it
              winds up sounding a bit like the CA prop 65 warnings on
              literally everything in the sense that it provides no
              useful guidance on how to evaluate alternatives against
              each other. So I agree with removing this *unless* you can
              articulate a specific privacy concern.

    ==>MT2

    Good. Then we confirm to remove that sentence, as already done in
    [PR].

    [PR]
    https://github.com/ace-wg/ace-revoked-token-notification/pull/10

    <==

                * Furthermore, doesn't the security issue identified in 13.5 imply that only
RSes should be notified of revocations, and clients left to wonder until their
requests are denied, or at least until after the RSes to which the token is
relevant have been notified? Secrecy matters really only because we want to
prevent bad actors with access to the token from taking advantage of it during
the window between revocation and RSes being aware of that revocation: if you
notify the bad actor proactively, it doesn't really matter that you kept it
secret from other nefarious observers. (Note that changing this would require
changes to the recommendation from 13.4.)

    For what it's worth, this is not a novel problem, and it is one that
    plagues revocation systems in general. Moreover, all the possible
    approaches are unsatisfying in some way, often relying on assumptions about
    the state of mind/intent or expected behavior of the adversarial actor.

              ==>MT

              We did not mean to imply that only RSs should be notified
              of revoked access tokens, and the intention was not to
              have Clients left to wonder.

              In fact, an access token might be revoked because the RS
              is found or suspected to be compromised. In such a case,
              informing of the revocation is first of all in the
              interest of the Client, in order to protect the Client
              from accessing resources at an RS that is deemed
              malevolent or not appropriate to access.

            It seems
              like the right way to terminate the ability of a
              compromised RS to communicate is to revoke its server
              certificate so all clients will fail (D)TLS handshakes,
              tokens aside. From a security engineering perspective,
              this is really the proper solution: revoke the credential
              that authenticates the server. (The access tokens, by
              contrast, are really credentials that authenticate and
              authorize the client.)

            But if
              you decide instead to move ahead with revoking tokens as a
              proxy for deauthenticating the server, perhaps the
              guidance should be to leave the suspected compromised
              party wondering and only proactively notify the other
              parties to the token. That implies maybe having two
              classes of TRL: one for clients and one for RSes, and to
              issue the revocation to one or the other depending on the
              situation.

            All of
              this is lipstick on a pig, however, as the unavoidable
              problem with revocation is the polling period of the list.
              There's a certain degree of "best effort" involved in
              using revocation of offline credentials to prevent
              interactions after learning of a compromise. Maybe that's
              what really should appear under security considerations: a
              statement of the inherent limitations of revocation lists
              and how that should be taken into consideration when
              designing systems that leverage these tokens for
              authentication and authorization.

    ==>MT2

    Please note that there are different possible reasons for revoking
    an access token, many of which are not related to a registered
    device being compromised or suspected of being so. Examples of those reasons
    are listed in the second paragraph of Section 1.0 "Introduction". In
    such cases, there are no "bad actors".

    Also related to this, the GENART review archived at [GENART-REVIEW]
    noted that

    > ... the process(es) by which a token is declared revoked, and
    the method by which the AS is notified of that (and consequently
    updates the TRL), is out of scope.  That fact is implicit in this
    document, but stating it ensures someone doesn't hunt through this
    document looking for a specification of the revocation process.

    We already addressed that point by adding the following text, as a
    new paragraph at the end of Section 1.0 "Introduction":

    > The process by which access tokens are declared revoked is out
    of the scope of this document. It is also out of scope the method by
    which the AS determines or is notified of revoked access tokens,
    according to which the AS consequently updates the TRL as specified
    in this document.

    That is, upon the revocation of an access token, the AS might not
    know that a registered device to which the access token pertains has
    been specifically compromised or misbehaving.

    Even if the registered device is indeed compromised and the AS is
    aware of that, the AS is unlikely to be in a position to revoke the
    long-term authentication credential of that registered device, and
    thus to prevent other parties from interacting with it altogether.
    If deemed appropriate, that is definitely an important and fitting
    step for the issuer of that authentication credential to take.
    (Side note: in its asymmetric mode of operation, the DTLS profile
    of ACE specified in RFC 9202 considers only raw public keys, but
    not full-fledged certificates.)

    What the AS is certainly and autonomously supposed to do is to
    officially declare an access token as revoked (irrespective of the
    specific reason), update its TRL accordingly, and allow the devices
    to which the access token pertains to learn about that, through the
    method defined in this document. Until a new access token is issued,
    this results in terminating the associated secure communication
    association between the Client and Resource Server for which the
    access token was issued, and prevents the Client from accessing
    protected resources at the Resource Server.

    About the polling period on the TRL, there is certainly a trade-off
    between that period and the ability to stay aligned with pertaining
    access tokens that have been revoked. The (additional) use of CoAP
    Observe (RFC 7641) as a subscription mechanism helps in this
    respect, and the security considerations in Section 13.3
    "Communication Patterns" recommend not relying solely on that, but
    also using an appropriately tuned polling interval.

    That said, this discussion has made it clear that the document was
    missing further security considerations on what the TRL alone does
    *not* provide. That is:

    * From the TRL, the registered devices learn that a pertaining token
    has been revoked, but not the reason why, and not if that reason is
    a compromise, misbehavior, or decommissioning.

    * In the particular case where a registered device is compromised,
    misbehaving, or decommissioned, it might not be enough to only
    revoke its pertaining access tokens. That is, the entity that
    authoritatively declares a registered device to be compromised,
    misbehaving, or decommissioned should also promptly trigger the
    execution of additional revocation processes as deemed appropriate.
    These include, for instance:

      - De-registering the registered device from the AS, so that the AS
    does not issue further access tokens pertaining to that device.

      - If applicable, revoking the public authentication credential
    (e.g., the public key certificate) associated with the registered
    device.

      The methods by which these processes are triggered and carried out
    are out of the scope of this document.

    Within Section 13 "Security Considerations", we have now added a new
    subsection "Additional Security Measures" discussing the limitations
    and additional expected actions above. This is captured in the
    commit at [COMMIT].

    [GENART-REVIEW]
    https://mailarchive.ietf.org/arch/msg/ace/ETtaBMaSyoZKMD82kgG49P2cF9U/

    [COMMIT]
https://github.com/ace-wg/ace-revoked-token-notification/pull/10/commits/a64db82a2d000cdcc365406abde658062fa87083

    <==

            NEW (emphasis mine):

              > This can be due to different reasons. For example,
              the access token has actually been revoked and the Client
              is not aware about that yet, while the RS has gained
              knowledge about that and has expunged the access token.
              **As another example, the access token is still valid, but
              an on-path active adversary might have injected a forged
              4.01 (Unauthorized) response, or the RS might have deleted
              the access token from its local storage due to its
              dedicated storage space being all consumed.**

            How can
              an on-path active adversary inject forged messages into
              communication between two endpoints? I admit to having
              very little knowledge of ACE: is the communication not
              end-to-end integrity protected with server authentication,
              a la DTLS? (If that's not the case, then frankly all bets
              are off.)

    ==>MT2

    Citing also the immediately previous paragraph from the same Section
    13.4 (emphasis mine):

    > If a Client stores an access token that it still believes to be
    valid, and it accordingly attempts to access a protected resource at
    the RS, the Client might anyway receive an **unprotected** 4.01
    (Unauthorized) response from the RS.

    Thinking of the attack-free case first, the RS may have deleted the
    access token due to memory limitations, after which the RS
    terminates its secure association with the Client. Depending on the
    specifically used secure communication protocol, the Client might
    not be aware of that termination.

    When the Client later sends a protected request to access a
    resource at the RS per the still valid access token, the RS will
    reply with an unprotected 4.01 (Unauthorized) response, which may
    specifically be used to convey "AS Request Creation Hints". Due to
    the above, the RS has in fact no means to protect that response
    anyway. (Please refer to Sections 5.10.1.1, 5.10.2, 6.4, and 6.8 of
    RFC 9200 for further details.)

    If instead the RS is still storing an access token and it still
    shares an active secure communication association with the Client,
    an adversary can block the protected request from the Client, inject
    an unprotected 4.01 (Unauthorized) response in reply to the Client,
    and thus make the Client believe that the RS is not storing the
    access token anymore.
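
    As an illustration only, the two cases above suggest that a Client
    cannot tell from an unprotected 4.01 response alone whether its
    token was revoked. A minimal sketch of one possible Client reaction
    follows. It assumes the Client can retrieve its pertaining token
    hashes from the TRL endpoint at the AS over a protected channel.
    The names Action, on_unprotected_401, and trl_hashes are
    hypothetical and not taken from the draft.

```python
from enum import Enum


class Action(Enum):
    DISCARD_TOKEN = "token is listed in the TRL: treat it as revoked"
    REUPLOAD_TOKEN = "token is not listed: keep it and upload it to the RS again"


def on_unprotected_401(token_hash: str, trl_hashes: set) -> Action:
    """Decide what to do after an unprotected 4.01 (Unauthorized) response.

    trl_hashes is assumed to be the set of pertaining token hashes just
    retrieved from the AS. The unprotected response itself is not trusted,
    since it may come from the RS or from an on-path adversary.
    """
    if token_hash in trl_hashes:
        return Action.DISCARD_TOKEN
    return Action.REUPLOAD_TOKEN
```

    In this sketch, a forged 4.01 response costs the Client one extra
    TRL query and one token upload, but does not make it give up a
    token that is still valid.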

    <==

             1. The AS issues an access token TOKEN with a lifetime
              of X seconds. Instead of the 'exp' claim specifying an
              expiration time, TOKEN includes the 'exi' claim with value
              X (see Section 5.10.3 of RFC 9200). Then, the AS provides
              C with TOKEN.

            Aha, so
              "exi" is the source of the problem. If that weren't
              permitted by spec, and explicit deterministic timestamps
              were required for expiration, you would eliminate this
              entire class of problem. I suggest doing an RFC 9200bis to
              remove things like this, and maybe get the entire
              ecosystem reviewed by security experts to avoid basic but
              preventable architecture flaws that unnecessarily
              complicate the security story.

    ==>MT2

    Well, 'exi' certainly made it more difficult to design what is
    specified in Section 10.1 of this document :-)

    At the same time, the introduction of 'exi' in the ACE framework was
    motivated by the need to "support token expiration for devices that
    have no reliable way of synchronizing their internal clocks" (see
    Section 5.10.3 of RFC 9200), and thus cannot afford using 'exp'
    anyway.

    Please note that a dedicated handling of 'exi' was defined in
    Section 5.10.3 of RFC 9200, and its limitations/drawbacks were
    documented in the security considerations compiled in Section 6.6 of
    RFC 9200.
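
    To make the difference between 'exp' and 'exi' concrete, here is a
    minimal sketch under simplifying assumptions: 'exp' is treated as an
    absolute Unix timestamp, and 'exi' as a lifetime in seconds counted
    from the moment the RS received the token, measured on a local
    clock that only needs to tick and not to be synchronized. The names
    expired_by_exp and ExiToken are hypothetical.

```python
def expired_by_exp(exp: int, now_unix: float) -> bool:
    """'exp' check: requires a wall clock that agrees with the AS."""
    return now_unix >= exp


class ExiToken:
    """'exi' check: only elapsed local time since reception matters."""

    def __init__(self, exi: int, received_at: float):
        self.exi = exi                  # lifetime in seconds
        self.received_at = received_at  # local clock reading at reception

    def expired(self, now_local: float) -> bool:
        return now_local - self.received_at >= self.exi


# A token with exi=60 received at local time 100 is valid until local time 160.
token = ExiToken(exi=60, received_at=100.0)
print(token.expired(159.0))  # False
print(token.expired(160.0))  # True
```

    With 'exi', the end of validity depends on when the RS received the
    token, which is an event that the AS does not observe directly.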

    <==

             <==

                Other comments:

* I did not review the properties of, or analyze the correctness of, the
database consistency algorithm and associated update protocol used to keep
registered devices up-to-date with relevant token hashes. I do not know if this
algorithm and protocol were based on something specific, so if that is not the
case then my main observation would be that the consistency requirements here
are not unique to the proposed revocation function, so it may be worth
reviewing literature relevant to the problem space associated with database
view consistency across distributed and intermittently-connected devices to see
if there is something more generic that can be leveraged in solving this
problem.

              ==>MT

              We separately reply about the two different aspects raised
              in the comment.

              **Database consistency algorithm** - This is an important
              component of an AS that relies specifically on a database.

            I'm using
              the term "database" abstractly. In the simple case, this
              represents what every node in the network would see if
              they were synchronously accessing the same single copy of
              the data, which is by definition always consistent with
              itself.

            By
              contrast, in the case described in this document, the
              database (the TRL of unexpired tokens) is distributed into
              instances (the RSes) with updates flowing to it at
              different times (via TRL polling), which means you'd
              ideally like some kind of consistency model (e.g.,
              sequential consistency, serializability, etc.) to allow
              you to reason about what two nodes acting independently
              might see when querying their local instances at a
              particular time.

            This is
              one of the foundational problems in distributed systems.
              There's no reason to reinvent the wheel here. My
              recommendation is that if there is a model that does what
              you want in the distributed systems literature, you should
              just implement that. But deriving benefit from using a
              standard database consistency model might also depend on
              eliminating things like the aforementioned relative expiry
              "exi", which complicates not the database update but a
              node's behavior at the moment a token is received.

    ==>MT2

    The TRL is a resource hosted only at the AS, and is not distributed
    into instances. In particular:

    * Per Section 4, the TRL is a single data structure hosted only at
    the AS.

    * Per Section 4.1, the AS is the only writer of the TRL, as the only
    party authoritatively responsible for the information in the TRL. This is
    consistent with the AS being the only issuer of the access tokens
    under consideration.

    * Per Sections 5-8, the registered devices and the administrators
    are readers of the TRL.

      The retrieval of information from the TRL relies on RESTful
    interactions with the AS, specifically through a request with the
    idempotent and safe method GET, sent to the TRL endpoint at the AS.
    The corresponding response from the AS conveys information that is
    consistent with the same current representation of the TRL at the AS
    at that time.

      Until obtaining a next response from the AS, a node does not
    really "query its local instance", but instead sees what it has been
    storing since receiving the latest response from the AS.

      When a node additionally uses CoAP Observe (RFC 7641) as a
    subscription mechanism, the Observe notification responses from the
    AS provide "eventual consistency" to that node (see Section 1.3 of
    RFC 7641), again with respect to the only authoritative
    representation of the TRL at the AS.

    * Accessing the TRL at the AS is the only way for the readers to
    obtain that information.

      That is, a registered device or administrator does not obtain that
    information by accessing a different endpoint at another registered
    device or administrator.

    Any two nodes that query the TRL at the same time obtain a response
    that is consistent with the same, current representation of the TRL
    at the AS at that time. Therefore, there seems to be no particular
    issue to address.

    Certainly, two nodes might obtain different responses built on
    different versions of the TRL at the AS, if they query the TRL at
    different times, as the AS updates the TRL.

    However, the impact of a node having an outdated version of the TRL
    is limited to that specific node that has not yet queried the TRL
    since the last TRL update at the AS.
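
    The single-writer, many-readers arrangement described above can be
    sketched as follows. This is a toy model, not the draft's
    interface: the classes AuthorizationServer and Reader are
    hypothetical, and the model ignores the fact that the real TRL
    endpoint returns to each requester only the token hashes pertaining
    to it.

```python
class AuthorizationServer:
    """Holds the only authoritative copy of the TRL and is its only writer."""

    def __init__(self):
        self._trl = set()

    def revoke(self, token_hash: str) -> None:
        self._trl.add(token_hash)

    def get_trl(self) -> frozenset:
        # Stand-in for a GET request to the TRL endpoint: a snapshot of
        # the current representation at the time of the request.
        return frozenset(self._trl)


class Reader:
    """A registered device or administrator; it only ever reads from the AS."""

    def __init__(self, server: AuthorizationServer):
        self.server = server
        self.view = frozenset()  # what it has stored since its latest query

    def poll(self) -> None:
        self.view = self.server.get_trl()


server = AuthorizationServer()
a, b = Reader(server), Reader(server)
server.revoke("h1")
a.poll()                 # a is now up to date
print(a.view == b.view)  # False: only b, which has not polled, is stale
b.poll()
print(a.view == b.view)  # True
```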

    Even though each reader acts on its own, based on its current view
    of the TRL, there is clearly a trade-off between the query rate
    used by that node and the chance that the node has an outdated
    view. In this respect:

    * The (additional) use of CoAP Observe helps, and the security
    considerations in Section 13.3 "Communication Patterns" recommend
    not relying solely on that, but also using an appropriately tuned
    polling interval.

    * As a result of addressing the OPSDIR review archived at
    [OPSDIR-REVIEW], we have also extended Section 10.0 "Notification of
    Revoked Access Tokens" with additional text that concludes with

      > In order to limit the amount of time during which the
    requester is unaware of pertaining access tokens that have been
    revoked but are not expired yet, a requester SHOULD NOT rely solely
    on diff query requests. In particular, a requester SHOULD also
    regularly send a full query request to the TRL endpoint according to
    a related application policy.
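
    The reason for mixing diff queries with regular full queries can be
    illustrated with a toy model. It assumes that the AS keeps only a
    bounded number of recent TRL updates for answering diff queries, so
    a requester that polls too rarely can miss older updates. The
    classes TrlEndpoint and Requester and the parameters max_history
    and full_every are hypothetical, and the model does not reproduce
    the draft's message format.

```python
from collections import deque


class TrlEndpoint:
    """Toy stand-in for the TRL endpoint at the AS."""

    def __init__(self, max_history: int):
        self.hashes = set()
        self.history = deque(maxlen=max_history)  # oldest updates fall off

    def update(self, removed=(), added=()):
        self.hashes -= set(removed)
        self.hashes |= set(added)
        self.history.append((frozenset(removed), frozenset(added)))

    def full_query(self):
        return set(self.hashes)

    def diff_query(self):
        return list(self.history)  # retained updates, oldest first


class Requester:
    def __init__(self, endpoint: TrlEndpoint, full_every: int):
        self.endpoint = endpoint
        self.full_every = full_every  # every n-th poll is a full query
        self.view = set()
        self.polls = 0

    def poll(self):
        self.polls += 1
        if self.polls % self.full_every == 0:
            self.view = self.endpoint.full_query()
        else:
            for removed, added in self.endpoint.diff_query():
                self.view -= removed
                self.view |= added


endpoint = TrlEndpoint(max_history=2)
requester = Requester(endpoint, full_every=3)
for h in ("h1", "h2", "h3"):
    endpoint.update(added=[h])  # three revocations before the first poll
requester.poll()                # diff query: the update adding h1 is gone
print(sorted(requester.view))   # ['h2', 'h3']
requester.poll()                # diff query again: still missing h1
requester.poll()                # third poll is a full query
print(sorted(requester.view))   # ['h1', 'h2', 'h3']
```

    In this model, the full query is what bounds how long the requester
    can remain unaware of "h1".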

    [OPSDIR-REVIEW]
    https://mailarchive.ietf.org/arch/msg/ace/ElqlgO6FHPsjoqw7L3gKkbSqVUo/

    <==

          Kyle

    -- 
Marco Tiloca
Ph.D., Senior Researcher

Phone: +46 (0)70 60 46 501

RISE Research Institutes of Sweden AB
Box 1263
164 29 Kista (Sweden)

Division: Digital Systems
Department: Computer Science
Unit: Cybersecurity

https://www.ri.se
