Dear list, I am having problems selecting the 10 most recent rows from a large table (4.5M rows), sorted by a date column of that table. The large table has a column user_id which either should match a given user_id, or should match the column contact_id in a correlated table where the user_id of that correlated table must match the given user_id. The user_id values in the large table are distributed in such a way that in the majority of cases for a given user_id 10 matching rows can be found very early when looking at the table sorted by the date - propably within the first 1%. Sometimes the given user_id however would match any rows only very far towards the end of the sorted large table, or not at all. The query works fine for the common cases when matching rows are found early in the sorted large table, like this: testdb=# EXPLAIN ANALYZE SELECT * FROM large_table lt LEFT JOIN relationships r ON lt.user_id=r.contact_id WHERE r.user_id = 55555 OR lt.user_id = 55555 ORDER BY lt.created_at DESC LIMIT 10; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..33809.31 rows=10 width=646) (actual time=0.088..69.448 rows=10 loops=1) -> Nested Loop Left Join (cost=0.00..156983372.66 rows=46432 width=646) (actual time=0.082..69.393 rows=10 loops=1) Filter: ((r.user_id = 55555) OR (lt.user_id = 55555)) -> Index Scan Backward using large_created_at_index on large_table lt (cost=0.00..914814.94 rows=4382838 width=622) (actual time=0.028..0.067 rows=13 loops=1) -> Index Scan using relationships_contact_id_index on relationships r (cost=0.00..35.33 rows=16 width=24) (actual time=0.017..2.722 rows=775 loops=13) Index Cond: (lt.user_id = r.contact_id) Total runtime: 69.640 ms but for the following user_id there are 3M rows in the large table which are more recent then the 10th matching one. The query then does not perform so well: testdb=# EXPLAIN ANALYZE SELECT * FROM large_table lt LEFT JOIN relationships r ON lt.user_id=r.contact_id WHERE r.user_id = 12345 OR lt.user_id = 12345 ORDER BY lt.created_at DESC LIMIT 10; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..33809.31 rows=10 width=646) (actual time=203454.353..425978.718 rows=10 loops=1) -> Nested Loop Left Join (cost=0.00..156983372.66 rows=46432 width=646) (actual time=203454.347..425978.662 rows=10 loops=1) Filter: ((r.user_id = 12345) OR (lt.user_id = 12345)) -> Index Scan Backward using large_created_at_index on large_table lt (cost=0.00..914814.94 rows=4382838 width=622) (actual time=0.031..78386.769 rows=3017547 loops=1) -> Index Scan using relationships_contact_id_index on relationships r (cost=0.00..35.33 rows=16 width=24) (actual time=0.006..0.060 rows=18 loops=3017547) Index Cond: (lt.user_id = r.contact_id) Total runtime: 425978.903 ms When split it up into the two following queries it performs much better for that user_id. Since the results of the two could be combined into the desired result, I'm assuming it could also be done efficiently within one query, if only a better plan would be used. testdb=# EXPLAIN ANALYZE SELECT * FROM large_table lt WHERE lt.user_id = 12345 ORDER BY lt.created_at DESC LIMIT 10; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..33.57 rows=10 width=622) (actual time=64.030..71.720 rows=10 loops=1) -> Index Scan Backward using large_user_id_created_at_index on large_table lt (cost=0.00..7243.59 rows=2158 width=622) (actual time=64.023..71.662 rows=10 loops=1) Index Cond: (user_id = 12345) Total runtime: 71.826 ms testdb=# EXPLAIN ANALYZE SELECT * FROM large_table lt WHERE user_id IN (SELECT contact_id FROM relationships WHERE user_id=12345) ORDER BY created_at DESC LIMIT 10; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=6902.52..6902.54 rows=10 width=622) (actual time=0.232..0.262 rows=4 loops=1) -> Sort (cost=6902.52..6905.57 rows=1220 width=622) (actual time=0.225..0.237 rows=4 loops=1) Sort Key: lt.created_at -> Nested Loop (cost=42.78..6839.98 rows=1220 width=622) (actual time=0.090..0.185 rows=4 loops=1) -> HashAggregate (cost=42.78..42.79 rows=1 width=4) (actual time=0.059..0.062 rows=1 loops=1) -> Bitmap Heap Scan on relationships (cost=4.34..42.75 rows=11 width=4) (actual time=0.038..0.041 rows=1 loops=1) Recheck Cond: (user_id = 12345) -> Bitmap Index Scan on relationships_user_id_index (cost=0.00..4.34 rows=11 width=0) (actual time=0.027..0.027 rows=1 loops=1) Index Cond: (user_id = 12345) -> Index Scan using large_user_id_started_at_index on large_table lt (cost=0.00..6768.48 rows=2297 width=622) (actual time=0.020..0.087 rows=4 loops=1) Index Cond: (lt.user_id = relationships.contact_id) Total runtime: 0.439 ms I'm not very experienced reading query plans and don't know how to go about this from here - is it theoretically possible to have a query that performs well with the given data in both cases or is there a conceptual problem? The database was freshly imported and ANALYZEd before running the above tests. I also tried the following for every involved column: increase statistics target, analyze the table, explain analyze the slow query, but the plan never changed. The relevant schema and indices portions are: testdb=# \d large_table Table "public.large_table" Column | Type | Modifiers -------------+-----------------------------+-------------------------------------------------------- id | integer | not null default nextval('large_id_seq'::regclass) user_id | integer | not null created_at | timestamp without time zone | Indexes: "large_pkey" PRIMARY KEY, btree (id) "large_created_at_index" btree (created_at) "large_user_id_created_at_index" btree (user_id, created_at) testdb=# \d relationships Table "public.relationships" Column | Type | Modifiers ------------+-----------------------------+------------------------------------------------------------ id | integer | not null default nextval('relationships_id_seq'::regclass) user_id | integer | contact_id | integer | Indexes: "relationships_pkey" PRIMARY KEY, btree (id) "relationships_contact_id_index" btree (contact_id) "relationships_user_id_index" btree (user_id, contact_id) testdb=# select tablename, attname, null_frac, avg_width, n_distinct, correlation from pg_stats where tablename in ('large_table', 'relationships'); tablename | attname | null_frac | avg_width | n_distinct | correlation -----------------+-------------+------------+-----------+------------+------------- relationships | id | 0 | 4 | -1 | 0.252015 relationships | user_id | 0 | 4 | 3872 | 0.12848 relationships | contact_id | 0 | 4 | 3592 | 0.6099 large_table | id | 0 | 4 | -1 | 0.980048 large_table | user_id | 0 | 4 | 1908 | 0.527973 large_table | created_at | 0 | 8 | 87636 | 0.973318 select version(); version ----------------------------------------------------------------------------------------------- PostgreSQL 8.2.4 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.1.2 (Ubuntu 4.1.2-0ubuntu4) (1 row) grateful for any advice, Til ---------------------------(end of broadcast)--------------------------- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq