Re: Oauth -- shall we start using this?

Xavier Lamien <laxathom@xxxxxxxxxxxxxxxxx> · Sat, 9 Mar 2013 15:03:59 +0100

On Fri, Mar 8, 2013 at 11:07 PM, Toshio Kuratomi <a.badger@xxxxxxxxx> wrote:

So in the past week a bunch of us have been talking about API Keys, OAuth,

passwords, and other means of managing authn and authz in the web apps that

are up and coming (specifically mentioned were copr and datagrepper).

Puiterwijk has put in some time reading the OAuth specifications and on

Friday he walked me through how OAuth is supposed to work.  I'll give a

summary of his talk here and then we can kick off some discussion.

OAuth is a standardized method for a user to grant access to resources that

they own to people and things that are not themselves.  Currently this is being

used to allow a user to control the access to data and actions that may be

performed on one web service by another web service.  The concepts and

mechanisms can be used in any situation where the user wants to limit what the

software they are using can do on their behalf.

= Part I: What is OAuth? =

<snip> 

== Flow of a basic request from start to finish ==

* The client program needs to get access to a protected resource.

* The client asks the authorization server for a client-id and tells the server

  which permissions it needs

* The authorization server gives a url to the client

* The client program redirects the user to that url so the user can grant

  permissions to the client

* The authorization server authenticates the user (i.e. they log in to the

  authorization server).

* The authorization server asks the user to confirm they want to grant the

  requested permissions to the client.

* If the answer is no, the protocol ends.

* If the answer is yes, the user is redirected to the client with an

  `authorization code` in the request

* The client sends the `authorization code` to the authorization server.

* The authorization server generates an `access token` with the specific

  permissions that the client requested, expires the `authorization code`, and

  returns the `access token` to the client.

* The client requests the protected resource from the resource server using the

  `access token`.

* The resource server verifies that the `access token` is valid.  If it is, it

  allows access.

.. note:: the `authorization code` is only good for retrieving a single

    `access token` for the particular set of permissions that the user

    confirmed.
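
The flow above can be sketched in a few lines of Python. This is only a toy, in-memory model to make the steps concrete: the class, its method names, and the `copr-cli` client id are all made up for illustration, and a real deployment would go through HTTP redirects and an existing OAuth library rather than direct method calls.

```python
import secrets


class AuthorizationServer:
    """Toy in-memory authorization server (illustrative only)."""

    def __init__(self):
        self._codes = {}    # authorization code -> (client_id, permissions)
        self._tokens = {}   # access token -> (client_id, permissions)

    def grant(self, client_id, permissions, user_approves):
        """Called once the user has logged in and answered the prompt."""
        if not user_approves:
            return None  # the protocol ends here
        code = secrets.token_urlsafe(16)
        self._codes[code] = (client_id, frozenset(permissions))
        return code

    def exchange(self, client_id, code):
        """Trade an authorization code for an access token, exactly once."""
        entry = self._codes.pop(code, None)  # pop() expires the code
        if entry is None or entry[0] != client_id:
            return None
        token = secrets.token_urlsafe(32)
        self._tokens[token] = entry
        return token

    def permissions_for(self, token):
        """What a resource server would look up to validate a token."""
        entry = self._tokens.get(token)
        return entry[1] if entry else frozenset()


server = AuthorizationServer()
code = server.grant("copr-cli", {"copr:build"}, user_approves=True)
token = server.exchange("copr-cli", code)
second_try = server.exchange("copr-cli", code)  # None: the code was used up
```

Because `exchange()` removes the code before issuing the token, a replayed authorization code yields nothing, which matches the note above about the code being good for a single access token.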

.. note:: A client can request access to multiple resources at once.  Assuming

    the resource owner accepted all of them, the access token the client

    receives at the end will allow access to all of those.  A client typically

    has one access token from an authorization server that grants it all needed

    permissions on all of the resource servers that the authorization server

    can give out permissions for.  It is possible for a client to have multiple

    access tokens with different permissions from the same authorization server

    but the client would have to keep track of which permissions were granted

    by which token (and the user would have had to confirm that the client

    should be granted each set of permissions).

.. question:: an access token can contain permissions for multiple resource

    servers.  How do we secure the token from being used maliciously by a

    different resource server?  ie: I get an access token which grants some

    permissions on both fas and bodhi.  I send that access token to fas to

    retrieve some information.  What prevents fas from hanging onto that token

    and using it to access the protected resources on bodhi that it grants

    without my knowledge?

Every request sent to the auth server or a resource server has to be signed
with a consumer secret tied to the token/access_token, which means that no program other than
the one that obtained that access token can get through.
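
To make the signing idea concrete, here is a rough Python sketch modeled loosely on OAuth 1.0-style HMAC signing. It is deliberately simplified: the base string format, the function names, and the example values are assumptions for illustration, and the real OAuth 1.0 signature also covers a nonce and timestamp and uses strict percent-encoding.

```python
import hashlib
import hmac


def sign_request(method, url, params, consumer_secret, token_secret):
    """Return a hex HMAC-SHA256 signature over a simplified base string."""
    # Sort parameters so client and server build the same base string.
    normalized = "&".join(f"{k}={params[k]}" for k in sorted(params))
    base_string = "\n".join([method.upper(), url, normalized])
    # The key combines both secrets, so holding the token alone is not enough.
    key = f"{consumer_secret}&{token_secret}".encode()
    return hmac.new(key, base_string.encode(), hashlib.sha256).hexdigest()


def verify_request(method, url, params, signature, consumer_secret, token_secret):
    """Server side: recompute the signature and compare in constant time."""
    expected = sign_request(method, url, params, consumer_secret, token_secret)
    return hmac.compare_digest(expected, signature)
```

Under this scheme a resource server that only ever sees the token and the signature, but never the consumer secret, cannot produce a valid signature for a request to a different server.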

== But wait, there's more! ==

We've now seen one authorization via OAuth.  But OAuth is flexible.  There are a

few different ways this can work that you should be aware of:

* Other ways to request the access token.  The example above is what works best

  for third-party web clients.  However, there are other flows that might work

  better for CLI apps or "trusted" web clients

  - Implicit: user gets the access token directly from the authorization server

    rather than through an authorization code.  This shortcut is useful when the

    client is entirely in the browser (no third-party server involved).  With a

    third party server, the authorization code makes it so the user never sees

    the actual access token, only the authorization code.  If the client is

    running on the user's machine anyhow, there's no sense in that step.

  - Resource owner password credentials: The resource owner provides their

    credentials (username and password) to the client.  The client retrieves

    the access token from the authorization server using the credentials.  Then

    it discards the credentials and only keeps the access token for further

    requests.

  - Client credentials: Just defines that if the client is the resource server,

    it can authenticate itself to access its own resources... I'm a little

    unclear on this but I think one use would be for a resource server to use

    its externally available functions (which are protected by oauth) rather

    than having to write an equivalent function that is usable internally.

    puiterwijk mentions a different use: having a strict separation between

    tenants in the resource server's model and then having to prove you have

    permission to access the resource from a different tenant (not something

    we're likely to do).

* Verification of the access token can take many forms.

  - The authorization server could notify the resource server whenever a new

    access token is issued/revoked

  - The resource server could ask the authorization server to verify the token

    each time it receives one

  - The token could be signed by the auth server and thus be verifiable in and

    of itself.  The token could then contain the list of permissions so that

    the resource server would just consult the token to know what was

    available.  This should not be preferred as it makes revoking a token

    harder.

* The authorization server may or may not know about the range of permissions

  that it can grant.  The resource server needs to interpret what the

  permissions the access token grants mean so if the authorization server

  grants a made-up permission the application should just ignore it.
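
As a rough illustration of the self-verifying option and of ignoring unknown permissions, here is a Python sketch. The shared signing key, the token layout (`payload.mac`), and the permission names are all hypothetical; a real system would more likely use an established signed-token format and asymmetric keys.

```python
import base64
import hashlib
import hmac
import json

SIGNING_KEY = b"shared-between-auth-and-resource-server"  # hypothetical key
KNOWN_PERMISSIONS = {"copr:build", "copr:create"}          # hypothetical names


def issue_token(permissions):
    """Authorization server: serialize the permissions and append a MAC."""
    payload = base64.urlsafe_b64encode(json.dumps(sorted(permissions)).encode())
    mac = hmac.new(SIGNING_KEY, payload, hashlib.sha256).hexdigest()
    return payload.decode() + "." + mac


def permissions_from_token(token):
    """Resource server: verify the MAC, then keep only permissions it knows."""
    payload, _, mac = token.rpartition(".")
    expected = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, mac):
        return set()  # a forged or corrupted token grants nothing
    granted = set(json.loads(base64.urlsafe_b64decode(payload)))
    return granted & KNOWN_PERMISSIONS  # made-up permissions are ignored
```

Note that nothing here lets the resource server learn that a token was revoked: the token stays valid for as long as the key does, which is the drawback mentioned above.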

.. question:: Is it possible for the user to grant some of the requested

    permissions and deny others?  Or is it all or nothing?

It's all or nothing.
Obviously, if you deny access to the requested resources, the related token gets revoked.

We have a case at work where we have 20 tokens for one resource server.
It's just a matter of security level/choice.

== Refreshing a token and its caveats ==

An access token can have an expire time.  The expire time can be coupled with a

second token called a refresh token.  Usually the refresh token would expire

sometime after the access token would expire.  When the access token expires,

the refresh token could be used by the client to request a new access token

without prompting the user.  This is intended to protect against an attacker

who is sniffing packets from amassing enough ciphertext from multiple uses of a

single access token to be able to brute force that token.

This sort of automatic expiration and refresh **is not** meant to protect the

user in case the access token is copied without their knowledge (because the

refresh token can be copied at the same time).
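
The expiry-and-refresh mechanism can be sketched as follows in Python. The class and method names are invented for illustration, time is passed in explicitly (in seconds) to keep the example deterministic, and making each refresh token single use is an assumption of this sketch rather than something the text above requires.

```python
import secrets


class TokenIssuer:
    """Toy issuer of expiring access tokens paired with refresh tokens."""

    def __init__(self, access_ttl, refresh_ttl):
        self.access_ttl = access_ttl
        self.refresh_ttl = refresh_ttl
        self._access = {}   # access token -> expiry time
        self._refresh = {}  # refresh token -> expiry time

    def issue(self, now):
        """Hand out a fresh (access token, refresh token) pair."""
        access, refresh = secrets.token_urlsafe(32), secrets.token_urlsafe(32)
        self._access[access] = now + self.access_ttl
        self._refresh[refresh] = now + self.refresh_ttl
        return access, refresh

    def is_valid(self, access, now):
        return now < self._access.get(access, 0)

    def refresh(self, refresh, now):
        """Swap a live refresh token for a new pair without asking the user."""
        expiry = self._refresh.pop(refresh, 0)  # assumption: single use
        if now >= expiry:
            return None  # too late: the user has to authorize again
        return self.issue(now)
```

Anyone who copies both tokens can call `refresh()` just as the legitimate client would, which is why this mechanism does not help against a stolen token.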

= Part II: How do we use this? =

This section is less about OAuth itself but some proposals about how we can

best code OAuth usage in our web applications to be secure and featureful.

== Session vs token ==

Currently we have a concept of a session in all of our web apps.  You login.

Once you're logged in, the web app knows that future connections from your web

browser/CLI/etc are being made by you.  At some point the session expires or

you explicitly log out.  At that point, the session is over.  The expiration

time for most of our apps is currently 20 minutes of idle time but we've talked

about increasing this in the past.  Sessions in my mind should last tens of

minutes to hours.  Certainly no more than a day.  A session conceptually tells

the server that the user is present and interacting with the website (by saying

that the user has "recently" authenticated).

Tokens are more akin to passwords coupled with a restricted set of permissions.

They're intended to be valid for days to weeks.  Refresh tokens can (but don't

necessarily) be used to keep a low amount of ciphertext in the system while

still making authentication via access token transparent to the client.

Conceptually, they tell the server that the **client** (not user) is the same

one that was granted the permissions.

=== Using tokens to implement sessions ===

* Sessions need to be short term -- expiration would need to be low (perhaps an

  hour).  No possibility to refresh the token.  If you need to continue, you

  have to re-send your username + password (+ otp?)

* We want this specific token to represent that the user is present, not just

  that the client has been delegated permissions.

* It would make sense for the token to give out all permissions that the user

  has (at least, on this resource server) because the user is present. Example

  token permission: "*@*".

* If possible, saving this type of session token into a wallet/keyring would

  make sense as that would encrypt the on-disk representation.  However, we'd

  also have to account for the fact that these services might not be present.

* Suggested: access tokens with a validity of 5 minutes and refresh tokens

  of 20 minutes.  This would approximate our current cookie-based idle timeout.

.. question:: Can we also have a maximum number of refreshes or maximum time

    before the user has to reenter their credentials (username + password

    (+otp?))?
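
To see how the 5-minute/20-minute suggestion behaves as an idle timeout, and what a hard cap would add, here is a small Python simulation. The 8-hour cap and the function name are made up for illustration; the sketch also assumes that both lifetimes restart whenever the client refreshes.

```python
ACCESS_TTL = 5      # minutes, from the suggestion above
REFRESH_TTL = 20    # minutes, from the suggestion above
MAX_SESSION = 480   # minutes; hypothetical hard cap, as asked in the question


def session_survives(request_times):
    """Return True if every request can be served without a new login.

    request_times are minutes since login, in ascending order.
    """
    access_expiry = ACCESS_TTL
    refresh_expiry = REFRESH_TTL
    for t in request_times:
        if t >= MAX_SESSION:
            return False          # hard cap reached: credentials needed again
        if t < access_expiry:
            continue              # access token still valid
        if t >= refresh_expiry:
            return False          # idle too long: refresh token expired too
        # Refresh silently; both lifetimes restart from now.
        access_expiry = t + ACCESS_TTL
        refresh_expiry = t + REFRESH_TTL
    return True
```

The match with a cookie idle timeout is only approximate: the refresh lifetime restarts at the last refresh, not at the last request, so a user can be logged out after somewhat less than 20 idle minutes.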

== Some proposed best practices ==

* OAuth allows for very granular permissions.  You could put a separate

  permission on each resource that a client can request.  However, it doesn't

  require that you are granular or not because the application interprets the

  meaning of the permission.  A lazy resource server could have a single

  permission that covered anything that can be performed on the server but this

  means that a stolen token can be used to do anything that that user could do

  on that resource server.  We should attempt to identify common use cases and

  code separate permissions for them.  ie: "building a package in a copr" would

  belong in a separate permission from "creating a new copr".

I'm definitely +1 on this one.

* An access token should not be taken to represent the presence of the user.

  It means the user has delegated permission to perform this action to some

  "client".  It is possible that the client is a command line app or an api and

  the user is interacting with it directly but it cannot be assumed that this

  is the case.

+1 Also, all allowed access should be revoke-able by the user at any time.

* Following from that, changing authentication methods, password, yubikey,

  security questions, etc should never be allowed via an access token.  We want

  the user to be present to change these settings.

+1

* Tokens and sessions should not contain information about the authentication

  status.  They should not contain what permissions are held or when the

  session expires.  These are for the resource server and authorization server

  to determine.

Yeah, it's the resource server that actually defines what third parties are allowed
to get from it, even if a token is granted.

* Also following from that -- we should write things to allow for a session to

  be sufficient for allowing users to perform actions.  Access tokens describe

  a subset of the functions that the user themselves is allowed to perform.

* Client side -- we want to have different permissions if the user is running

  the cli from the command line vs running the cli from a cron job.  A user

  running from the cli could be said to have a session.

Hmm... I'm not sure we can really prevent a user from moving the exact same command line from a terminal
into a crontab,
unless we have strong policies on user/admin operations. SOP!
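
A few of the practices proposed above (granular permissions, a session being sufficient, and no credential changes via access token) could be expressed in a resource server roughly like this. The `Credential` type and the action names are hypothetical, and the sketch leaves out the check of what the user themselves is allowed to do.

```python
from dataclasses import dataclass, field


@dataclass
class Credential:
    """Either an interactive session or a delegated access token."""
    user: str
    is_session: bool
    permissions: frozenset = field(default_factory=frozenset)


# Hypothetical action name for changing an authentication method.
SESSION_ONLY = {"account:change_password"}


def is_allowed(cred, action):
    """Decide whether this credential may perform the action."""
    if cred.is_session:
        return True   # user is present (their own rights are not modeled here)
    if action in SESSION_ONLY:
        return False  # never via a delegated token, whatever it claims
    return action in cred.permissions  # granular, per-action permission


session = Credential("toshio", is_session=True)
token = Credential("toshio", is_session=False,
                   permissions=frozenset({"copr:build"}))
```

With separate `copr:build` and `copr:create` permissions, a stolen build token cannot be used to create new coprs, and no token at all can change the password.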

IRC log (since this is all paraphrased and I could have misunderstood what puiterwijk meant):

http://toshio.fedorapeople.org/puiterwijk-oauth.html

-- 
Xavier 
