Re: LYDB: Feasible to use PG roles instead of application-level security?

Melvin Davidson <melvin6925@xxxxxxxxx> · Fri, 30 Dec 2016 11:06:41 -0500

On Fri, Dec 30, 2016 at 10:23 AM, Stephen Frost <sfrost@xxxxxxxxxxx> wrote:
Greetings,

* Guyren Howe (guyren@xxxxxxxxx) wrote:

> it occurs to me to wonder whether it is practical to use PG’s own roles and security model in lieu of using an application-level one.

The short answer is yes.

> It seems that the role system in PG is sufficient for most general purposes. One could presumably also have a table with role names and associated metainformation (email address etc) as needed.

Yup.  That can get a bit awkward if you have multiple databases inside

of a single cluster, as you would have to pick which database to put

that metainformation in, but that isn't a very big issue.

> If I have a system with many thousands of users, is it practical to manage these users’ authentication and authorization using *just* Postgres?

For this, it really depends on if the PG authorization model matches the

requirements you have.  The PG auth model, particularly with RLS, is

extremely flexible but you would really need to evaluate what the exact

requirements are and how you would handle that with the PG auth model.

Of course, if there are just a few exceptions or complicated cases that

can't be satisfied directly with PG today, you could use security

definer functions.

One area that isn't fully addressed with the PG auth model today is

partial access to a certain column.  Consider a table where you want

users to have access to all of the rows and all of the columns *except*

for column X for rows where ID is > 1000.  The PG auth model today can

be used to say "you can't access column X" or to say "you can't access

rows where ID > 1000" but you can't combine those, yet.

I'm hopeful that we'll get there as there are definitely use-cases for

that kind of access control, but it's unlikely to happen for PG10.

> It occurs to me that some client frameworks might have issues with their connection pools if those connections keep switching users, assuming they even can, but let’s set that aside for now. Or perhaps every connection could immediately do a SET USER before executing its connection?

Again, yes, connection poolers can be an issue, but it's possible to use

the role system and do a 'set role X' after having connected as some

user that has very little access.  The issue here is controlling that

role change- there's no direct way in PG today to require a password to

be provided when doing the role change, which is unfortunate.  One

approach to solving that with RLS is to use a security definer function

to change a record in a table that is then used in all RLS policies.

It's a bit complicated and doesn't involve doing 'set role' though, so

there are some trade-offs there.

If you really want connection pooling and independent users in PG's role

system then you'll end up having to have the app code do the

authentication (or maybe auth to PG as the user and, if successful,

reconnect as the regular user and set role...  that's pretty awkward

though) and then connect and do the 'set role'.

One big question here, however, is if you're going to have thousands of

*concurrently connected* users.  Thousands of users shouldn't be too

much of an issue, but if they're all connected using PG's main auth

system then you'll have thousands of backend processes running.  That'll

end up causing some amount of overhead even if they're mostly idle.  If

your application can handle connecting/disconnecting pretty easily and

you have a relativly short timeout (though, ideally, not too short) then

perhaps your number of concurrent connections won't be too bad.

> This seems an attractive proposition from a security standpoint: if I use row-level security pervasively, I can have a security system that’s nestled nice and close to the data and presumably tricky to work around from a hacker given direct access only to the client application.

If that's the threat model you want to address then you'll have to work

out the concurrent connections question.  One thing which can help is to

use a common user for 'read-only/public-access (or at least low-value)'

queries from the app, if there are such.

> Is this practical? Has anyone here done it? What might the caveats be?

Yes, yes, see above.

Thanks!

Stephen

>Postgres roles are global to the cluster,
Well, that is true by default, however, you  can make roles database specific:
https://www.postgresql.org/docs/9.4/static/runtime-config-connection.html#RUNTIME-CONFIG-CONNECTION-SECURITY
db_user_namespace = on

That being said, there is a trade off of managing multiple users & passwords VS simple access roles. 
-- 
Melvin Davidson
I reserve the right to fantasize.  Whether or not you 
 wish to share my fantasy is entirely up to you.