Partitioned/inherited tables with check constraints causing slower query plans

Richard Jones <rj@xxxxxxxxxxxx> · Fri, 4 May 2012 16:01:00 +0100

Hi,
I'm seeing poor query performance using partitioned tables with check
constraints, seems like the plan is much worse than when querying the
individual partitions manually.

select version(); --> PostgreSQL 9.1.1 on x86_64-unknown-linux-gnu,
compiled by gcc (Debian 4.4.5-8) 4.4.5, 64-bit

uname -a --> Linux  2.6.32-5-amd64 #1 SMP Mon Oct 3 03:59:20 UTC 2011
x86_64 GNU/Linux

(postgresql.conf included at the end)

Output of EXPLAIN ANALYZE follows, but here are the tables in question:

Here's the empty parent table:

 \d+ ircevents

  Column   |  Type   |       Modifiers
-----------+---------+------------------------
 buffer    | integer | not null
 id        | bigint  | not null  <----------------  microsecs since 1970-01-01
 type      | text    | not null
 highlight | boolean | not null default false
 json      | text    | not null
...
Child tables: ircevents_201008,
              ircevents_201009,
            ...
              ircevents_201211,
              ircevents_201212

And one example child table (they are all the same apart from
non-overlapping check constraints):

\d+ ircevents_201204
                    Table "public.ircevents_201204"
  Column   |  Type   |       Modifiers        | Storage  | Description
-----------+---------+------------------------+----------+-------------
 buffer    | integer | not null               | plain    |
 id        | bigint  | not null               | plain    |
 type      | text    | not null               | extended |
 highlight | boolean | not null default false | plain    |
 json      | text    | not null               | extended |
Indexes:
    "ircevents_201204_idx" UNIQUE, btree (buffer, id)
    "ircevents_201204_highlight_idx" btree (highlight) WHERE highlight
= true
Check constraints:
    "ircevents_201204_id_check" CHECK (id >= (date_part('epoch'::text,
'2012-04-01 00:00:00'::timestamp without time zone)::bigint * 1000000)
AND id < (date_part('epoch'::text, '2012-05-01 00:00:00'::timestamp
without time zone)::bigint * 1000000))
Inherits: ircevents

The tables experience heavy insert/select load for the month in
question, then less selects after that. update/delete to these tables
is very rare.
The ircevents_201204 table has ~200 million rows

Let's use a 20-day range spanning only one month:

ircevents=# select date_part('epoch', '2012-04-02'::timestamp without
time zone)::bigint * 1000000;
     ?column?
------------------
 1333317600000000
(1 row)

ircevents=# select date_part('epoch', '2012-04-22'::timestamp without
time zone)::bigint * 1000000;
     ?column?
------------------
 1335045600000000
(1 row)

The next two queries are the crux of the problem for me:

EXPLAIN ANALYZE
SELECT id, type, json
FROM ircevents
WHERE
buffer = 116780 AND
id BETWEEN 1333317600000000 AND 1335045600000000
ORDER BY id DESC limit 100;

    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=80506.28..80506.53 rows=100 width=134) (actual
time=0.179..0.196 rows=40 loops=1)
   ->  Sort  (cost=80506.28..80558.01 rows=20692 width=134) (actual
time=0.178..0.185 rows=40 loops=1)
         Sort Key: public.ircevents.id
         Sort Method: quicksort  Memory: 33kB
         ->  Result  (cost=0.00..79715.45 rows=20692 width=134)
(actual time=0.039..0.121 rows=40 loops=1)
               ->  Append  (cost=0.00..79715.45 rows=20692 width=134)
(actual time=0.037..0.111 rows=40 loops=1)
                     ->  Seq Scan on ircevents  (cost=0.00..0.00
rows=1 width=72) (actual time=0.001..0.001 rows=0 loops=1)
                           Filter: ((id >= 1333317600000000::bigint)
AND (id <= 1335045600000000::bigint) AND (buffer = 116780))
                     ->  Bitmap Heap Scan on ircevents_201204
ircevents  (cost=914.36..79715.45 rows=20691 width=134) (actual
time=0.035..0.103 rows=40 loops=1)
                           Recheck Cond: ((buffer = 116780) AND (id >=
1333317600000000::bigint) AND (id <= 1335045600000000::bigint))
                           ->  Bitmap Index Scan on
ircevents_201204_idx  (cost=0.00..909.18 rows=20691 width=0) (actual
time=0.023..0.023 rows=40 loops=1)
                                 Index Cond: ((buffer = 116780) AND
(id >= 1333317600000000::bigint) AND (id <= 1335045600000000::bigint))
 Total runtime: 0.243 ms
(13 rows)

(note that the above plan demonstrates that constraint exclusion is
active, since it only queries the empty parent table, and the
appropriate partitioned table)

Compare the cost of that vs. specifying the partitioned table manually:

EXPLAIN ANALYZE
SELECT id, type, json
FROM ircevents_201204
WHERE buffer = 116780 AND
id BETWEEN 1333317600000000 AND 1335045600000000
ORDER BY id DESC limit 100;

    QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..404.02 rows=100 width=134) (actual
time=0.024..0.071 rows=40 loops=1)
   ->  Index Scan Backward using ircevents_201204_idx on
ircevents_201204  (cost=0.00..83595.11 rows=20691 width=134) (actual
time=0.023..0.062 rows=40 loops=1)
         Index Cond: ((buffer = 116780) AND (id >=
1333317600000000::bigint) AND (id <= 1335045600000000::bigint))
 Total runtime: 0.102 ms
(4 rows)

Querying the partition directly uses an "index scan backward", which
seems the best approach.

I see similar plans if the id range spans multiple tables - and I get
a much more efficient plan if manually construct a query by UNIONing
all the relevant partitions together.

Is there anything I can do to make querying to parent table in this
fashion use "Index Scan Backward" on the appropriate partitions, and
thus be as fast as querying the partitions directly?

Thanks,

RJ

PS Here is my postgresql.conf: (the server has 16GB/ and two mirrored
pairs of disks with pg_xlog on a different pair to the data)

listen_addresses = '*'
max_connections = 20                    # (change requires restart)
shared_buffers = 3GB                    # min 128kB
work_mem = 50MB                         # min 64kB
maintenance_work_mem = 500MB            # min 1MB "50mb per gig, ish"
synchronous_commit = off                # synchronization level; on,
off, or local
wal_buffers = 16MB                      # min 32kB, -1 sets based on
shared_buffers
checkpoint_segments = 100               # in logfile segments, min 1, 16MB each
effective_cache_size = 8GB              # how much is the OS gonna use
for caching disk stuff?
log_min_duration_statement = 500        # -1 is disabled, 0 logs all statements
track_activities = on
track_counts = on
track_functions = pl                    # none, pl, all
update_process_title = on
datestyle = 'iso, mdy'
lc_messages = 'C'                       # locale for system error message
lc_monetary = 'C'                       # locale for monetary formatting
lc_numeric = 'C'                        # locale for number formatting
lc_time = 'C'                           # locale for time formatting
default_text_search_config = 'pg_catalog.english'

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance