Hi Postgres community,
We are experiencing performance issues when RLS is enabled on large tables. Here is a simplified example.
We have a table:
CREATE TABLE emp.employees (
    employee_id   INTEGER PRIMARY KEY,
    -- The companies table is defined in a different schema, not accessible to the emp service
    company_id    INTEGER NOT NULL,
    employee_name TEXT NOT NULL
);
Index for employees table:
CREATE INDEX employees_company_id_idx ON emp.employees (company_id);
And for the table we have RLS select policy:
CREATE POLICY employee_select_policy ON emp.employees
    FOR SELECT
    USING (
        company_id = ANY(
            coalesce(
                string_to_array(current_setting('emp.authorized_read_company_ids', TRUE), ',')::INTEGER[],
                ARRAY[]::INTEGER[]
            )
        )
    );
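For completeness, RLS is enabled on the table as usual (shown roughly here; the exact DDL in our setup may also force the policy for the owning role):

ALTER TABLE emp.employees ENABLE ROW LEVEL SECURITY;
-- Possibly also, so the table owner is subject to the policy as well:
-- ALTER TABLE emp.employees FORCE ROW LEVEL SECURITY;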
When a very simple query is executed, for instance:
SET emp.authorized_read_company_ids = '1, 2, 3, ..., 200';

SELECT count(*) FROM emp.employees WHERE TRUE; -- 68091 rows
The query plan for this query reads:
Aggregate  (cost=1096.02..1096.03 rows=1 width=8) (actual time=8.740..8.740 rows=1 loops=1)
  Output: count(*)
  Buffers: shared hit=778
  ->  Index Only Scan using employees_company_id_idx on emp.employees  (cost=0.35..970.78 rows=50099 width=0) (actual time=0.124..4.976 rows=49953 loops=1)
        Output: company_id
        Index Cond: (employees.company_id = ANY (COALESCE((string_to_array(current_setting('emp.authorized_read_company_ids'::text, true), ','::text))::integer[], '{}'::integer[])))
        Heap Fetches: 297
        Buffers: shared hit=778
Planning:
  Buffers: shared hit=12
Planning Time: 0.824 ms
Execution Time: 8.768 ms
The problem arises when we make the RLS SELECT policy condition slightly more complicated by adding an admin check inside the policy:
CREATE POLICY employee_select_policy ON emp.employees
    FOR SELECT
    USING (
        coalesce(nullif(current_setting('emp.is_admin', TRUE), ''), 'false')::BOOLEAN
        OR company_id = ANY(
            coalesce(
                string_to_array(current_setting('emp.authorized_read_company_ids', TRUE), ',')::INTEGER[],
                ARRAY[]::INTEGER[]
            )
        )
    );
When the same simple query is executed:
SET emp.is_admin = TRUE;
SET emp.authorized_read_company_ids = '1, 2, 3, ..., 200';

SELECT count(*) FROM emp.employees WHERE TRUE; -- 68091 rows
The query plan now reads:
Aggregate  (cost=6238.51..6238.52 rows=1 width=8) (actual time=2156.271..2156.272 rows=1 loops=1)
  Output: count(*)
  Buffers: shared hit=367
  ->  Index Only Scan using employees_company_id_idx on emp.employees  (cost=0.29..6099.16 rows=55740 width=0) (actual time=0.065..2151.939 rows=49953 loops=1)
        Output: company_id
        Filter: ((COALESCE(NULLIF(current_setting('emp.is_admin'::text, true), ''::text), 'false'::text))::boolean OR (employees.company_id = ANY (COALESCE((string_to_array(current_setting('emp.authorized_read_company_ids'::text, true), ','::text))::integer[], '{}'::integer[]))))
        Rows Removed by Filter: 11430
        Heap Fetches: 392
        Buffers: shared hit=367
Planning Time: 0.744 ms
Execution Time: 2156.302 ms
We can see that performance has deteriorated badly: the policy condition no longer appears as an index condition on the company ids. Instead it is applied as a plain filter, so every row returned by the scan is checked against the full expression, including every company id in the session setting.
As the table grows and the number of company ids in the session setting increases, the execution time keeps getting longer.
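As a side note, we suspect (though we have not measured this variant) that the same plan shape would appear even without RLS if the policy expression were written directly into the WHERE clause, which would suggest it is the OR itself, rather than the RLS machinery, that prevents the index condition:

EXPLAIN (ANALYZE, BUFFERS, VERBOSE)
SELECT count(*)
FROM emp.employees
WHERE coalesce(nullif(current_setting('emp.is_admin', TRUE), ''), 'false')::BOOLEAN
   OR company_id = ANY(
        coalesce(
            string_to_array(current_setting('emp.authorized_read_company_ids', TRUE), ',')::INTEGER[],
            ARRAY[]::INTEGER[]
        )
    );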
To summarise: we would like admin users to run without any RLS restrictions, and normal users to have RLS enforced via the index on company_id. Unfortunately, we cannot have queries executed by admin users connect to the database as a different database user.
Is there anything you could suggest?
Thanks,
Charles