Hi, I've recently split my log table into time-based partitions, which really improves insert speed and query times for certain queries. However, I can't help thinking the query optimizer is really suboptimal here. My partitions look like this:

CREATE TABLE log_cdf
(
  id serial NOT NULL,
  tstamp timestamp without time zone,
  peopleid integer,
  room character varying(50),
  event character varying(50),
  "type" character varying(50),
  contentid integer,
  CONSTRAINT pk_log_cdf PRIMARY KEY (id)
)
WITH (OIDS=TRUE);
ALTER TABLE log_cdf OWNER TO postgres;
GRANT ALL ON TABLE log_cdf TO postgres;
GRANT SELECT ON TABLE log_cdf TO public;

CREATE TABLE log_cdf_200810
  ( CHECK ( tstamp >= DATE '2008-10-01' AND tstamp < DATE '2008-11-01' ) )
  INHERITS (log_cdf);
CREATE TABLE log_cdf_200811
  ( CHECK ( tstamp >= DATE '2008-11-01' AND tstamp < DATE '2008-12-01' ) )
  INHERITS (log_cdf);
CREATE TABLE log_cdf_200812
  ( CHECK ( tstamp >= DATE '2008-12-01' AND tstamp < DATE '2009-01-01' ) )
  INHERITS (log_cdf);
CREATE TABLE log_cdf_200901
  ( CHECK ( tstamp >= DATE '2009-01-01' AND tstamp < DATE '2009-02-01' ) )
  INHERITS (log_cdf);

CREATE INDEX idx_log_cdf_200810_tstamp ON log_cdf_200810 USING btree (tstamp);
CREATE INDEX idx_log_cdf_200811_tstamp ON log_cdf_200811 USING btree (tstamp);
CREATE INDEX idx_log_cdf_200812_tstamp ON log_cdf_200812 USING btree (tstamp);
CREATE INDEX idx_log_cdf_200901_tstamp ON log_cdf_200901 USING btree (tstamp);

And of course I've added the matching trigger function as well. On top of that, I have my old table, which I've renamed to log_cdf_old, given a check constraint for tstamp < DATE '2008-10-01', and set to inherit log_cdf.
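(For reference, the routing trigger follows the usual pattern from the partitioning chapter of the manual — a sketch along these lines, with the real function covering every live partition:)

```sql
-- Sketch of the standard partition-routing trigger; the actual function
-- has one branch per partition listed above.
CREATE OR REPLACE FUNCTION log_cdf_insert_trigger()
RETURNS trigger AS $$
BEGIN
    IF NEW.tstamp >= DATE '2008-12-01' AND NEW.tstamp < DATE '2009-01-01' THEN
        INSERT INTO log_cdf_200812 VALUES (NEW.*);
    ELSIF NEW.tstamp >= DATE '2009-01-01' AND NEW.tstamp < DATE '2009-02-01' THEN
        INSERT INTO log_cdf_200901 VALUES (NEW.*);
    ELSE
        RAISE EXCEPTION 'tstamp out of range for log_cdf partitions';
    END IF;
    RETURN NULL;  -- row was routed to a child; don't also insert into the parent
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER trg_log_cdf_insert
    BEFORE INSERT ON log_cdf
    FOR EACH ROW EXECUTE PROCEDURE log_cdf_insert_trigger();
```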
-------------- PROBLEM 1 --------------

But now, simple queries like:

SELECT * FROM log_cdf ORDER BY tstamp DESC LIMIT 100;

all of a sudden take ages to complete, and the query plan looks absolutely awful:

Limit  (cost=8333060.45..8333060.70 rows=100 width=374)
  ->  Sort  (cost=8333060.45..8676529.57 rows=137387645 width=374)
        Sort Key: public.log_cdf.tstamp
        ->  Result  (cost=0.00..3082203.45 rows=137387645 width=374)
              ->  Append  (cost=0.00..3082203.45 rows=137387645 width=374)
                    ->  Seq Scan on log_cdf  (cost=0.00..12.00 rows=200 width=374)
                    ->  Seq Scan on log_cdf_old log_cdf  (cost=0.00..2915379.36 rows=129675136 width=57)
                    ->  Seq Scan on log_cdf_200810 log_cdf  (cost=0.00..166776.09 rows=7711709 width=58)
                    ->  Seq Scan on log_cdf_200811 log_cdf  (cost=0.00..12.00 rows=200 width=374)
                    ->  Seq Scan on log_cdf_200812 log_cdf  (cost=0.00..12.00 rows=200 width=374)
                    ->  Seq Scan on log_cdf_200901 log_cdf  (cost=0.00..12.00 rows=200 width=374)

Meanwhile, the plan for

SELECT * FROM log_cdf_200810 ORDER BY tstamp DESC LIMIT 100;

Limit  (cost=0.00..7.51 rows=100 width=58)
  ->  Index Scan Backward using idx_log_cdf_200810_tstamp on log_cdf_200810  (cost=0.00..579351.47 rows=7711021 width=58)

is much, much better.

-------------- PROBLEM 2 --------------

I also notice that the query planner doesn't take time variables (like LOCALTIMESTAMP or now()) into account at all.
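(A manual workaround sketch for Problem 1 — pushing the sort and limit down into each partition so its tstamp index can be used, then merging the results; partition names are the ones from above, and in practice every partition would need to be listed:)

```sql
-- Hand-rolled equivalent of ORDER BY tstamp DESC LIMIT 100 on the parent:
-- limit inside each partition first (each can use an index scan backward),
-- then sort and limit the small merged result.
SELECT * FROM (
    SELECT * FROM log_cdf_200901 ORDER BY tstamp DESC LIMIT 100
) p1
UNION ALL
SELECT * FROM (
    SELECT * FROM log_cdf_200812 ORDER BY tstamp DESC LIMIT 100
) p2
ORDER BY tstamp DESC
LIMIT 100;
```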
Look at this:

SELECT * FROM log_cdf
WHERE tstamp > '2008-10-10'  -- 5 days ago
  AND tstamp < '2008-10-15'; -- today

gives the plan:

Result  (cost=0.00..205388.86 rows=3144503 width=374)
  ->  Append  (cost=0.00..205388.86 rows=3144503 width=374)
        ->  Seq Scan on log_cdf  (cost=0.00..13.00 rows=1 width=374)
              Filter: ((tstamp > '2008-10-10 00:00:00'::timestamp without time zone) AND (tstamp < '2008-10-15 00:00:00'::timestamp without time zone))
        ->  Seq Scan on log_cdf_200810 log_cdf  (cost=0.00..205375.86 rows=3144502 width=58)
              Filter: ((tstamp > '2008-10-10 00:00:00'::timestamp without time zone) AND (tstamp < '2008-10-15 00:00:00'::timestamp without time zone))

which, for all intents and purposes, is exactly what I'd expect. But now

SELECT * FROM log_cdf
WHERE tstamp > LOCALTIMESTAMP - interval '5 days'
  AND tstamp < LOCALTIMESTAMP;

gives me the following plan:

Result  (cost=0.00..1161067.27 rows=3476371 width=374)
  ->  Append  (cost=0.00..1161067.27 rows=3476371 width=374)
        ->  Seq Scan on log_cdf  (cost=0.00..15.50 rows=1 width=374)
              Filter: ((tstamp < ('now'::text)::timestamp without time zone) AND (tstamp > (('now'::text)::timestamp without time zone - '5 days'::interval)))
        ->  Bitmap Heap Scan on log_cdf_old log_cdf  (cost=8054.10..909421.75 rows=378889 width=57)
              Recheck Cond: ((tstamp > (('now'::text)::timestamp without time zone - '5 days'::interval)) AND (tstamp < ('now'::text)::timestamp without time zone))
              ->  Bitmap Index Scan on idx_log_cdf_old_tstamp  (cost=0.00..7959.37 rows=378889 width=0)
                    Index Cond: ((tstamp > (('now'::text)::timestamp without time zone - '5 days'::interval)) AND (tstamp < ('now'::text)::timestamp without time zone))
        ->  Bitmap Heap Scan on log_cdf_200810 log_cdf  (cost=76722.53..251605.18 rows=3097478 width=58)
              Recheck Cond: ((tstamp > (('now'::text)::timestamp without time zone - '5 days'::interval)) AND (tstamp < ('now'::text)::timestamp without time zone))
              ->  Bitmap Index Scan on idx_log_cdf_200810_tstamp  (cost=0.00..75948.16 rows=3097478 width=0)
                    Index Cond: ((tstamp > (('now'::text)::timestamp without time zone - '5 days'::interval)) AND (tstamp < ('now'::text)::timestamp without time zone))
        ->  Index Scan using idx_log_cdf_200811_tstamp on log_cdf_200811 log_cdf  (cost=0.01..8.28 rows=1 width=374)
              Index Cond: ((tstamp > (('now'::text)::timestamp without time zone - '5 days'::interval)) AND (tstamp < ('now'::text)::timestamp without time zone))
        ->  Index Scan using idx_log_cdf_200812_tstamp on log_cdf_200812 log_cdf  (cost=0.01..8.28 rows=1 width=374)
              Index Cond: ((tstamp > (('now'::text)::timestamp without time zone - '5 days'::interval)) AND (tstamp < ('now'::text)::timestamp without time zone))
        ->  Index Scan using idx_log_cdf_200901_tstamp on log_cdf_200901 log_cdf  (cost=0.01..8.28 rows=1 width=374)
              Index Cond: ((tstamp > (('now'::text)::timestamp without time zone - '5 days'::interval)) AND (tstamp < ('now'::text)::timestamp without time zone))

Am I expecting too much, or am I doing something wrong here? Any help on how to improve this would be appreciated -- or perhaps one of the core developers (Tom?) would care to tweak the optimiser a bit for a following release...

Regards,
Peter

--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
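(One way around Problem 2 that I can think of — constraint exclusion is done at plan time, so LOCALTIMESTAMP is opaque to it — is to compute the bounds first and interpolate them as literals via dynamic SQL. A sketch; the function name is made up, and the explicit ::text casts are there because quote_literal() expects text:)

```sql
-- Hypothetical helper: evaluate the now()-derived bounds up front and
-- splice them into the query as constants, so the planner can exclude
-- partitions whose CHECK constraints don't overlap the range.
CREATE OR REPLACE FUNCTION recent_log_cdf(days integer)
RETURNS SETOF log_cdf AS $$
DECLARE
    r log_cdf%ROWTYPE;
BEGIN
    FOR r IN EXECUTE
        'SELECT * FROM log_cdf WHERE tstamp > '
        || quote_literal((LOCALTIMESTAMP - days * interval '1 day')::text)
        || ' AND tstamp < '
        || quote_literal(LOCALTIMESTAMP::text)
    LOOP
        RETURN NEXT r;
    END LOOP;
    RETURN;
END;
$$ LANGUAGE plpgsql;

SELECT * FROM recent_log_cdf(5);
```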