Problem with partitionning and orderby query plans

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hello,

In the same context that my  previous thread on this mailing list (the
database holding  500k articles of  a french daily newspaper),  we now
need to handle the users' comments on the articles (1 million for now,
quickly growing).

In our context, we'll have three kind of queries :

- queries on articles only ;

- queries on comments only ;

- queries on both articles and comments.

We tried to use the partitionning feature described at
http://www.postgresql.org/docs/8.4/static/ddl-partitioning.html , with three
tables :

- libeindex (master table, no data)

- libearticle (articles)

- libecontribution (comments)

The schema looks like :

CREATE TABLE libeindex (

  id integer,
  classname varchar(255),
  createdAt timestamp,
  modifiedAt timestamp,
...
  PRIMARY KEY (classname, id)
);


CREATE TABLE libecontribution (
  CHECK (classname = 'contribution'), 
  PRIMARY KEY (classname, id)
) INHERITS (libeindex) ;

CREATE TABLE libearticle (
  CHECK (classname = 'article'), 
  PRIMARY KEY (classname, id)
) INHERITS (libeindex) ;

With many indexes are created on the two subtables, including :
CREATE INDEX libearticle_createdAt_index ON libearticle (createdAt);
CREATE INDEX libearticle_class_createdAt_index ON libearticle (classname, createdAt);

The problem we  have is that with the  partionned table, PostgreSQL is
now unable  to use the "index  scan backwards" query plan  on a simple
"order by limit" query.

For example :

libepart=> explain analyze SELECT classname, id FROM libeindex WHERE (classname IN ('article')) ORDER BY createdAt DESC LIMIT 50;
                                                                       QUERY PLAN                                                                        
---------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=114980.14..114980.27 rows=50 width=20) (actual time=4070.953..4071.076 rows=50 loops=1)
   ->  Sort  (cost=114980.14..116427.34 rows=578878 width=20) (actual time=4070.949..4070.991 rows=50 loops=1)
         Sort Key: public.libeindex.createdat
         Sort Method:  top-N heapsort  Memory: 28kB
         ->  Result  (cost=0.00..95750.23 rows=578878 width=20) (actual time=0.068..3345.727 rows=578877 loops=1)
               ->  Append  (cost=0.00..95750.23 rows=578878 width=20) (actual time=0.066..2338.575 rows=578877 loops=1)
                     ->  Index Scan using libeindex_pkey on libeindex  (cost=0.00..8.27 rows=1 width=528) (actual time=0.011..0.011 rows=0 loops=1)
                           Index Cond: ((classname)::text = 'article'::text)
                     ->  Seq Scan on libearticle libeindex  (cost=0.00..95741.96 rows=578877 width=20) (actual time=0.051..1364.296 rows=578877 loops=1)
                           Filter: ((classname)::text = 'article'::text)
 Total runtime: 4071.195 ms
(11 rows)

libepart=> explain analyze SELECT classname, id FROM libearticle WHERE (classname IN ('article')) ORDER BY createdAt DESC LIMIT 50;
                                                                             QUERY PLAN                                                                             
--------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.00..9.07 rows=50 width=20) (actual time=0.033..0.200 rows=50 loops=1)
   ->  Index Scan Backward using libearticle_createdat_index on libearticle  (cost=0.00..105053.89 rows=578877 width=20) (actual time=0.030..0.112 rows=50 loops=1)
         Filter: ((classname)::text = 'article'::text)
 Total runtime: 0.280 ms
(4 rows)

As you can see, PostgreSQL  doesn't realize that the table "libeindex"
is in  fact empty, and  that it only  needs to query the  subtable, on
which it can use the "Index Scan Backward" query plan.

Is this a known limitation of the partionning method ? If so, it could
be interesting to mention it on  the documentation. If not, is there a
way to work around the problem ?

Regards,

-- 
Gaël Le Mignot - gael@xxxxxxxxxxxxxxxx
Pilot Systems - 9, rue Desargues - 75011 Paris
Tel : +33 1 44 53 05 55 - www.pilotsystems.net
Gérez vos contacts et vos newsletters : www.cockpit-mailing.com

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux