Hi, I'm seeing poor query performance using partitioned tables with check constraints, seems like the plan is much worse than when querying the individual partitions manually. select version(); --> PostgreSQL 9.1.1 on x86_64-unknown-linux-gnu, compiled by gcc (Debian 4.4.5-8) 4.4.5, 64-bit uname -a --> Linux 2.6.32-5-amd64 #1 SMP Mon Oct 3 03:59:20 UTC 2011 x86_64 GNU/Linux (postgresql.conf included at the end) Output of EXPLAIN ANALYZE follows, but here are the tables in question: Here's the empty parent table: \d+ ircevents Column | Type | Modifiers -----------+---------+------------------------ buffer | integer | not null id | bigint | not null <---------------- microsecs since 1970-01-01 type | text | not null highlight | boolean | not null default false json | text | not null ... Child tables: ircevents_201008, ircevents_201009, ... ircevents_201211, ircevents_201212 And one example child table (they are all the same apart from non-overlapping check constraints): \d+ ircevents_201204 Table "public.ircevents_201204" Column | Type | Modifiers | Storage | Description -----------+---------+------------------------+----------+------------- buffer | integer | not null | plain | id | bigint | not null | plain | type | text | not null | extended | highlight | boolean | not null default false | plain | json | text | not null | extended | Indexes: "ircevents_201204_idx" UNIQUE, btree (buffer, id) "ircevents_201204_highlight_idx" btree (highlight) WHERE highlight = true Check constraints: "ircevents_201204_id_check" CHECK (id >= (date_part('epoch'::text, '2012-04-01 00:00:00'::timestamp without time zone)::bigint * 1000000) AND id < (date_part('epoch'::text, '2012-05-01 00:00:00'::timestamp without time zone)::bigint * 1000000)) Inherits: ircevents The tables experience heavy insert/select load for the month in question, then less selects after that. update/delete to these tables is very rare. The ircevents_201204 table has ~200 million rows Let's use a 20-day range spanning only one month: ircevents=# select date_part('epoch', '2012-04-02'::timestamp without time zone)::bigint * 1000000; ?column? ------------------ 1333317600000000 (1 row) ircevents=# select date_part('epoch', '2012-04-22'::timestamp without time zone)::bigint * 1000000; ?column? ------------------ 1335045600000000 (1 row) The next two queries are the crux of the problem for me: EXPLAIN ANALYZE SELECT id, type, json FROM ircevents WHERE buffer = 116780 AND id BETWEEN 1333317600000000 AND 1335045600000000 ORDER BY id DESC limit 100; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=80506.28..80506.53 rows=100 width=134) (actual time=0.179..0.196 rows=40 loops=1) -> Sort (cost=80506.28..80558.01 rows=20692 width=134) (actual time=0.178..0.185 rows=40 loops=1) Sort Key: public.ircevents.id Sort Method: quicksort Memory: 33kB -> Result (cost=0.00..79715.45 rows=20692 width=134) (actual time=0.039..0.121 rows=40 loops=1) -> Append (cost=0.00..79715.45 rows=20692 width=134) (actual time=0.037..0.111 rows=40 loops=1) -> Seq Scan on ircevents (cost=0.00..0.00 rows=1 width=72) (actual time=0.001..0.001 rows=0 loops=1) Filter: ((id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint) AND (buffer = 116780)) -> Bitmap Heap Scan on ircevents_201204 ircevents (cost=914.36..79715.45 rows=20691 width=134) (actual time=0.035..0.103 rows=40 loops=1) Recheck Cond: ((buffer = 116780) AND (id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint)) -> Bitmap Index Scan on ircevents_201204_idx (cost=0.00..909.18 rows=20691 width=0) (actual time=0.023..0.023 rows=40 loops=1) Index Cond: ((buffer = 116780) AND (id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint)) Total runtime: 0.243 ms (13 rows) (note that the above plan demonstrates that constraint exclusion is active, since it only queries the empty parent table, and the appropriate partitioned table) Compare the cost of that vs. specifying the partitioned table manually: EXPLAIN ANALYZE SELECT id, type, json FROM ircevents_201204 WHERE buffer = 116780 AND id BETWEEN 1333317600000000 AND 1335045600000000 ORDER BY id DESC limit 100; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.00..404.02 rows=100 width=134) (actual time=0.024..0.071 rows=40 loops=1) -> Index Scan Backward using ircevents_201204_idx on ircevents_201204 (cost=0.00..83595.11 rows=20691 width=134) (actual time=0.023..0.062 rows=40 loops=1) Index Cond: ((buffer = 116780) AND (id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint)) Total runtime: 0.102 ms (4 rows) Querying the partition directly uses an "index scan backward", which seems the best approach. I see similar plans if the id range spans multiple tables - and I get a much more efficient plan if manually construct a query by UNIONing all the relevant partitions together. Is there anything I can do to make querying to parent table in this fashion use "Index Scan Backward" on the appropriate partitions, and thus be as fast as querying the partitions directly? Thanks, RJ PS Here is my postgresql.conf: (the server has 16GB/ and two mirrored pairs of disks with pg_xlog on a different pair to the data) listen_addresses = '*' max_connections = 20 # (change requires restart) shared_buffers = 3GB # min 128kB work_mem = 50MB # min 64kB maintenance_work_mem = 500MB # min 1MB "50mb per gig, ish" synchronous_commit = off # synchronization level; on, off, or local wal_buffers = 16MB # min 32kB, -1 sets based on shared_buffers checkpoint_segments = 100 # in logfile segments, min 1, 16MB each effective_cache_size = 8GB # how much is the OS gonna use for caching disk stuff? log_min_duration_statement = 500 # -1 is disabled, 0 logs all statements track_activities = on track_counts = on track_functions = pl # none, pl, all update_process_title = on datestyle = 'iso, mdy' lc_messages = 'C' # locale for system error message lc_monetary = 'C' # locale for monetary formatting lc_numeric = 'C' # locale for number formatting lc_time = 'C' # locale for time formatting default_text_search_config = 'pg_catalog.english' -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance