I am designing an application which requires fine-grained role-based
security, where every logical object in the system has an ACL which
expresses the permissions allowed by roles.
Implementing this on a high level is trivial, but it must scale, and
scale extremely well. I have some fairly good solutions, but I'm not
altogether satisfied with them, and a few extra eyeballs would be of
enormous help. Of particular concern are optimizations related to
filtering queries based on read permissions.
The model is as follows. The system is divided into realms, which are
completely isolated areas in which a specific security policy is
enforced; pretty much analogous to Windows domains or Kerberos
realms. Objects in the system live in these realms, and users have
permissions on these objects based on the role they are assigned in
the realm. For example, a user may have the admin role in one realm,
but an ordinary user role in another; admins may modify specific
objects, ordinary users only read.
The realm table looks a bit like this:
create table realms (
id serial primary key,
schema_id integer references schemas (id)
);
The types of roles that can be applied within any given realm is
defined by the realm schema. Schemas are domain-specific; for
example, one schema may define the roles "Guest", "Privileged",
"Administrator". Another schema may define the roles "Contributor",
"Editor", "Moderator". Roles from one schema may only be applied to
objects in realms that use this schema.
create table schemas (
id serial primary key,
name text
);
create table roles (
id serial primary key,
schema_id integer references schemas (id),
name text
);
Now, every object of every class has a master table, something like:
create table widgets (
id serial primary key,
realm_id integer references realms (id),
title text
);
create table gadgets (
...
);
Then we have a table of permissions expressing each role's
permissions on every object:
create table widget_permissions (
widget_id integer references widgets (id),
role_id integer references roles (id),
read boolean,
write boolean,
delete boolean,
...
);
create table gadget_permissions (
...
);
And finally, we have the concept of realm membership; every user can
be a member of any number of realms, and they are granted a single
role in each realm.
create table memberships (
id serial primary key,
realm_id integer references realms (id),
user_id integer references users (id),
role_id integer references roles (id)
);
As for quantities: There is a limited number (<10) of schemas, a
small number (<10) of roles per schema, a large (thousands) number of
realms, a large (<500) number realm memberships per user, and fairly
large (1 million+) number of objects.
Finding objects that are accessible by a given user involves a couple
of joins:
select ...
from widgets
inner join widget_permissions on widget_permissions.widget_id =
widgets.id
and widgets_permission.read = true
and widgets_role_id in (
select role_id
from memberships
where memberships.realm_id = widgets.realm_id
and memberships.user_id = :user
)
Not too bad, but those extra joins are incurred on every single
query. There is another downside, which is that the result set of
this query cannot return any information about the actual permissions
on the object; an additional query is needed for that. Since the
application relies on such data in order to display GUI elements
related to editing and so on, this is a significant problem.
At the moment, this system has been implemented quite successfully,
but with little thought to performance, optimistically assuming that
optimizations can come up later. Later, it turns out, is now. We are
looking at one optimization which makes the permission lists more
compact -- and integrated as part of the result set -- at the expense
of additional joins. We know that there is a small number of roles in
each realm; if we assume that each role has a unique number, then we
can use it as a bit index:
create table roles (
id serial primary key,
schema_id integer references schemas (id),
name text,
bit_index integer
);
Now we can express the entire ACL as a set of attributes on each object:
create table widgets (
id serial primary key,
realm_id integer references realms (id),
title text,
read_bitmap int,
write_bitmap int,
delete_bitmap int,
...
);
Now the join becomes:
select ...
from widgets
where exists (
select memberships.id
from memberships
inner join realms on realms.id = memberships.realm_id
inner join schemas on schemas.id = realms.schema_id
inner join roles on memberships_role.id
where memberships.user_id = :user
and widgets.read_bitmap & (1 << roles.bit_index) != 0
)
Now we can get read_bitmap, write_bitmap etc. as part of every result
set, and use this to introspect every object's ACL.
Of course, we know beforehand which realms a user is a member of, and
we know which schemas these realms use. We can cache this in the
application and reduce the lookup by dynamically optimizing the query:
select ...
from widgets
where (case realm_id
when 9 then read_bitmap & 3
when 43 then read_bitmap & 1
when 112 then read_bitmap & 6
...
else 0
) != 0
Not too bad; the problem here is that when the user is a member of
many realms, the case expression grows very long indeed. I have not
yet performed any synthetic tests to determine how fast PostgreSQL
can evaluate such large expressions. Unfortunately, I am using Rails/
ActiveRecord and are therefore not in a position to use prepared
statements to optimize the compilation and planning of these statements.
Note that I already rely on Memcached to cache certain database
information to reduce database lookups; in other words, I am
comfortable with solutions that don't rely exclusively on PostgreSQL.
I would be happy to implement stored procedures in C just to make
this really fast.
Any insight into how this model could be improved upon will be highly
appreciated.
Alexander.