So in the past week a bunch of us have been talking about API keys, OAuth, passwords, and other means of managing authn and authz in the web apps that are up and coming (specifically mentioned were copr and datagrepper). Puiterwijk has put in some time reading the OAuth specifications and on Friday he walked me through how OAuth is supposed to work. I'll give a summary of his talk here and then we can kick off some discussion.

OAuth is a standardized method for a user to grant access to resources that they own to people and things that are not themselves. Currently this is being used to allow a user to control the access to data and actions that may be performed on one web service by another web service. The concepts and mechanisms can be used in any situation where the user wants to limit what the software they are using can do on their behalf.

= Part I: What is OAuth? =

== Let's name all the things ==

* Protected Resource is either data or a function that you want to use.
* Resource Server is the server that hosts the Protected Resource.
* Client is the program that needs access to the Protected Resource.
* Resource Owner is the person (may also be a system but we'll concentrate on actual people for this summary) who is authorized to grant clients access to the Protected Resource.
* Authorization Server is the server that grants tokens and codes.

.. note:: `access` is used to mean that the client can use the protected resource. That usage might cause changes to data or cause other actions to be taken (like kicking off a build). Be careful not to read `access` as "ability to read the data".

Example time: There's a resource called 'full name of toshio'. It's hosted on the Fedora Account System so FAS is the resource server. Since toshio is my account, I am the resource owner for it. When I log into the PackageDB, PackageDB wants to display my full name to greet me. PackageDB needs to contact FAS for that information. Thus PackageDB is the client.
If we create an OAuth server, that server will be the Authorization Server that verifies my identity and asks me to grant access to 'full name of toshio' to the PackageDB. Once I've done so, it will issue the tokens and codes that actually let the PackageDB get the information it wants from FAS.

== Flow of a basic request from start to finish ==

* The client program needs to get access to a protected resource.
* The client asks the authorization server for a client-id and tells the server which permissions it needs.
* The authorization server gives a url to the client.
* The client program redirects the user to that url so the user can grant permissions to the client.
* The authorization server authenticates the user (ie: they log in to the authorization server).
* The authorization server asks the user to confirm they want to grant the requested permissions to the client.
  - If the answer is no, the protocol ends.
  - If the answer is yes, the user is redirected to the client with an `authorization code` in the request.
* The client sends the `authorization code` to the authorization server.
* The authorization server generates an `access token` with the specific permissions that the client requested, expires the `authorization code`, and returns the `access token` to the client.
* The client requests the protected resource from the resource server using the `access token`.
* The resource server verifies that the `access token` is valid. If it is, it allows access.

.. note:: The `authorization code` is only good for retrieving a single `access token` for the particular set of permissions that the user confirmed.

.. note:: A client can request access to multiple resources at once. Assuming the resource owner accepted all of them, the access token the client receives at the end will allow access to all of those.
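
To make the flow above concrete, here is a minimal in-memory sketch in Python. It is only an illustration under loud assumptions: the class names, method names, the `packagedb` client id and the `fullname:toshio` permission string are all made up here and do not come from any Fedora service or OAuth library. Redirects and HTTP requests are replaced by plain function calls, and the resource server verifies tokens by asking the authorization server each time.

```python
import secrets


class AuthorizationServer:
    """Hypothetical stand-in for the server that grants codes and tokens."""

    def __init__(self):
        self._codes = {}   # authorization code -> (client_id, user, permissions)
        self._tokens = {}  # access token -> (client_id, user, permissions)

    def authorize(self, client_id, user, permissions, user_approves):
        # The user has logged in and been asked to confirm the permissions.
        if not user_approves:
            return None  # the answer was no: the protocol ends here
        code = secrets.token_urlsafe(16)
        self._codes[code] = (client_id, user, frozenset(permissions))
        return code  # handed to the client via the redirect

    def exchange_code(self, client_id, code):
        # pop() expires the code, so it is good for a single access token.
        grant = self._codes.pop(code, None)
        if grant is None or grant[0] != client_id:
            return None
        token = secrets.token_urlsafe(32)
        self._tokens[token] = grant
        return token

    def check_token(self, token, permission):
        grant = self._tokens.get(token)
        return grant is not None and permission in grant[2]


class ResourceServer:
    """Hypothetical resource server that verifies every token it receives."""

    def __init__(self, auth_server, resources):
        self._auth = auth_server
        self._resources = resources  # permission name -> protected value

    def get(self, token, permission):
        if not self._auth.check_token(token, permission):
            raise PermissionError(permission)
        return self._resources[permission]


# Walk through the example: PackageDB (client) wants toshio's full name from FAS.
auth = AuthorizationServer()
fas = ResourceServer(auth, {"fullname:toshio": "<full name placeholder>"})
code = auth.authorize("packagedb", "toshio", ["fullname:toshio"], user_approves=True)
token = auth.exchange_code("packagedb", code)
print(fas.get(token, "fullname:toshio"))
```

Note that the client only ever handles the authorization code and the access token; the user's credentials stay between the user and the authorization server.
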
A client typically has one access token from an authorization server that grants it all needed permissions on all of the resource servers that the authorization server can give out permissions for. It is possible for a client to have multiple access tokens with different permissions from the same authorization server but the client would have to keep track of which permissions were granted by which token (and the user would have had to confirm that the client should be granted each set of permissions).

.. question:: An access token can contain permissions for multiple resource servers. How do we secure the token from being used maliciously by a different resource server? ie: I get an access token which grants some permissions on both fas and bodhi. I send that access token to fas to retrieve some information. What prevents fas from hanging onto that token and using it, without my knowledge, to access the protected resources on bodhi that it grants?

== But wait, there's more! ==

We've now seen one authorization via OAuth. But OAuth is flexible. There are a few different ways this can work to be aware of:

* Other ways to request the access token. The example above is what works best for third-party web clients. However, there are other flows that might work better for CLI apps or "trusted" web clients:
  - Implicit: the user gets the access token directly from the authorization server rather than through an authorization code. This shortcut is useful when the client is entirely in the browser (no third-party server involved). With a third-party server, the authorization code makes it so the user never sees the actual access token, only the authorization code. If the client is running on the user's machine anyhow, there's no sense in that step.
  - Resource owner password credentials: The resource owner provides their credentials (username and password) to the client. The client retrieves the access token from the authorization server using the credentials.
    Then it discards the credentials and only keeps the access token for further requests.
  - Client credentials: This just defines that if the client is the resource server, it can authenticate itself to access its own resources... I'm a little unclear on this but I think one use would be for a resource server to use its externally available functions (which are protected by OAuth) rather than having to write an equivalent function that is usable internally. puiterwijk mentions a different use: having a strict separation between tenants in the resource server's model and then having to prove you have permission to access the resource from a different tenant (not something we're likely to do).
* Verification of the access token can take many forms.
  - The authorization server could notify the resource server whenever a new access token is issued/revoked.
  - The resource server could ask the authorization server to verify the token each time it receives one.
  - The token could be signed by the auth server and thus be verifiable in and of itself. The token could then contain the list of permissions so that the resource server would just consult the token to know what was available. This should not be preferred as it makes revoking a token harder.
* The authorization server may or may not know about the range of permissions that it can grant. The resource server needs to interpret what the permissions granted by the access token mean, so if the authorization server grants a made-up permission the application should just ignore it.

.. question:: Is it possible for the user to grant some of the requested permissions and deny others? Or is it all or nothing?

== Refreshing a token and its caveats ==

An access token can have an expire time. The expire time can be coupled with a second token called a refresh token. Usually the refresh token would expire sometime after the access token would expire.
When the access token expires, the refresh token could be used by the client to request a new access token without prompting the user. This is intended to prevent an attacker who is sniffing packets from amassing enough ciphertext from multiple uses of a single access token to be able to brute force that token. This sort of automatic expiration and refresh **is not** meant to protect the user in case the access token is copied without their knowledge (because the refresh token can be copied at the same time).

= Part II: How do we use this? =

This section is less about OAuth itself and more about some proposals for how we can best code OAuth usage in our web applications to be secure and featureful.

== Session vs token ==

Currently we have a concept of a session in all of our web apps. You log in. Once you're logged in, the web app knows that future connections from your web browser/CLI/etc are being made by you. At some point the session expires or you explicitly log out. At that point, the session is over. The expiration time for most of our apps is currently 20 minutes of idle time but we've talked about increasing this in the past.

Sessions in my mind should last tens of minutes to hours. Certainly no more than a day. A session conceptually tells the server that the user is present and interacting with the website (by saying that the user has "recently" authenticated).

Tokens are more akin to passwords coupled with a restricted set of permissions. They're intended to be valid for days to weeks. Refresh tokens can (but don't necessarily) be used to keep a low amount of ciphertext in the system while still making authentication via access token transparent to the client. Conceptually, they tell the server that the **client** (not user) is the same one that was granted the permissions.

=== Using tokens to implement sessions ===

* Sessions need to be short term -- expiration would need to be low (perhaps an hour). No possibility to refresh the token.
  If you need to continue, you have to re-send your username + password (+ otp?).
* We want this specific token to represent that the user is present, not just that the client has been delegated permissions.
* It would make sense for the token to give out all permissions that the user has (at least, on this resource server) because the user is present. Example token permission: "*@*".
* If possible, saving this type of session token into a wallet/keyring would make sense as that would encrypt the on-disk representation. However, we'd also have to account for the fact that these services might not be present.
* Suggested: access tokens with a validity of 5 minutes and refresh tokens with a validity of 20 minutes. This would approximate our current cookie-based idle timeout.

.. question:: Can we also have a maximum number of refreshes or a maximum time before the user has to reenter their credentials (username + password (+ otp?))?

== Some proposed best practices ==

* OAuth allows for very granular permissions. You could put a separate permission on each resource that a client can request. However, it doesn't require that you be granular, because the application interprets the meaning of the permission. A lazy resource server could have a single permission that covered anything that can be performed on the server but this means that a stolen token can be used to do anything that the user could do on that resource server. We should attempt to identify common use cases and code separate permissions for them. ie: "building a package in a copr" would belong in a separate permission from "creating a new copr".
* An access token should not be taken to represent the presence of the user. It means the user has delegated permission to perform this action to some "client". It is possible that the client is a command line app or an api and the user is interacting with it directly but it cannot be assumed that this is the case.
* Following from that, changing authentication methods, password, yubikey, security questions, etc should never be allowed via an access token. We want the user to be present to change these settings.
* Tokens and sessions should not contain information about the authentication status. They should not contain what permissions are held or when the session expires. These are for the resource server and authorization server to determine.
* Also following from that -- we should write things so that a session is sufficient for allowing users to perform actions. Access tokens describe a subset of the functions that the user themselves is allowed to perform.
* Client side -- we want to have different permissions if the user is running the cli from the command line vs running the cli from a cron job. A user running from the cli could be said to have a session.

IRC log (since this is all paraphrased and I could have misunderstood what puiterwijk meant): http://toshio.fedorapeople.org/puiterwijk-oauth.html

-Toshio