Search Postgresql Archives

Re: FW: Constraint exclusion in partitions

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Oops, I was responding to the email below from melvin6925

 

 

From: pgsql-general-owner@xxxxxxxxxxxxxx [mailto:pgsql-general-owner@xxxxxxxxxxxxxx] On Behalf Of David G. Johnston
Sent: May-23-15 19:32
To: Daniel Begin
Cc: melvin6925; pgsql-general@xxxxxxxxxxxxxx
Subject: Re: [NOVICE] Constraint exclusion in partitions

 

On Saturday, May 23, 2015, Daniel Begin <jfd553@xxxxxxxxxxx> wrote:

I am working with postgresql 9.3 and I understand from the documentation that constraint_exclusion is set to “partition” by default. Looking at my postgres.conf file, the concerned line is “#constraint_exclusion = partition”. 

Furthermore, the execution plan shows that constraint_exclusion was used at least for constant id and range of ids

What is your question/concern? 

 

 

Did you remember to set  constraint_exclusion = on and reload the .conf ?

 

 

 

From: pgsql-general-owner@xxxxxxxxxxxxxx [mailto:pgsql-general-owner@xxxxxxxxxxxxxx] On Behalf Of melvin6925
Sent: May-23-15 15:15
To: Daniel Begin; pgsql-general@xxxxxxxxxxxxxx
Subject: Re: FW: Constraint exclusion in partitions

 

Did you remember to set  constraint_exclusion = on and reload the .conf ?

 

 

 

 

Sent via the Samsung Galaxy S® 6, an AT&T 4G LTE smartphone

-------- Original message --------
From: Daniel Begin <jfd553@xxxxxxxxxxx>
Date: 05/23/2015 14:37 (GMT-05:00)
To: pgsql-general@xxxxxxxxxxxxxx
Subject: Re: FW: Constraint exclusion in partitions

Following Francisco suggestion, I was able to do some tests earlier this morning when the partitioning process completed and all the resulting tables analyzed.

Here is what I got on both the original table and its partitioned counterpart while running the same queries. I tested them only for a couple of values but in summary...

Using a constant id:
All the queries I tried took longer on the partitioned table! I got similar results for multiple records using IN (id value1, id value2 ...)

Using a range of ids:
Surprisingly again, all the queries I tried took longer on the partitioned table!

Using a list of ids from a select clause:
More surprisingly, the queries I tried took less time on the partitioned table at least when using the primary key. Using an indexed field took so long compared to the old table that I cancelled the execution for the new one!

Guess what, I will get back to my old fat table unless someone tells me I missed something obvious!
Daniel

Note: Tables/indexes description, queries and execution plans are below.






Tables/indexes description ----------------------------------------------------------------------------------
The original table has 3870130000 records. Primary key/index on each partition queries are
ALTER TABLE oldtable ADD CONSTRAINT oldtable_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX oldtable_groupid_idx ON oldtable USING btree (group_id);

The partitioned table has 3870130000 records distributed over 87 partitions. Primary key/index on each partition queries are
ALTER TABLE newtable_xx ADD CONSTRAINT newtablexx_idversion_pk PRIMARY KEY (id, version);
CREATE INDEX newtablexx_groupid_idx ON newtable_xx USING btree (group_id);
Where xx is the partition's number suffix

constant id -------------------------------------------------------------------------------------------------------
select * from oldtable where id=123456789;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4437.15 rows=1682 width=66)"
"  Index Cond: (id = 123456789::bigint)"
--Total query runtime: 62 ms. 1 rows retrieved


select * from newtable where id=123456789;
"Append  (cost=0.00..20.19 rows=5 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = 123456789::bigint)"
"  ->  Index Scan using newtable72_idversion_pk on newtable_72  (cost=0.56..20.19 rows=4 width=66)"
"        Index Cond: (id = 123456789::bigint)"
--Total query runtime: 156 ms. 1 rows retrieved

I got similar results for multiple records...
select * from oldtable where id IN(10000000,1000000000,2000000000,3000000000);
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.71..17739.18 rows=6726 width=66)"
"  Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 187 ms. 4 rows retrieved

select * from newtable where id IN(10000000,1000000000,2000000000,3000000000);
"Append  (cost=0.00..933.40 rows=223 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
"  ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.57..622.78 rows=156 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
...
"  ->  Index Scan using newtable85_idversion_pk on newtable_85  (cost=0.57..53.37 rows=9 width=66)"
"        Index Cond: (id = ANY ('{10000000,1000000000,2000000000,3000000000}'::bigint[]))"
--Total query runtime: 421 ms. 4 rows retrieved


range of ids -------------------------------------------------------------------------------------------------------
select * from oldtable where id between 1522999949 and 1523000049;
"Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..383.51 rows=144 width=66)"
"  Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 47 ms. 53 rows retrieved.

select * from newtable where id between 1522999949 and 1523000049;
"Append  (cost=0.00..408.16 rows=104 width=66)"
"  ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        Filter: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable51_idversion_pk on newtable_51  (cost=0.56..183.52 rows=46 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
"  ->  Index Scan using newtable52_idversion_pk on newtable_52  (cost=0.56..224.64 rows=57 width=66)"
"        Index Cond: ((id >= 1522999949) AND (id <= 1523000049))"
Total query runtime: 78 ms. 53 rows retrieved.


list of ids from a select clause -------------------------------------------------------------------------------------------------------
--Subset provides 4 ids similar but not identical to the previous query
select * from oldtable where id IN (select * from subset);
"Nested Loop  (cost=37.45..886298.00 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_idversion_pk on oldtable  (cost=0.70..4414.37 rows=1693 width=66)"
"        Index Cond: (id = subset.id)"
Total query runtime: 171 ms. 4 rows retrieved.

select * from newtable where id IN (select * from subset)
"Nested Loop  (cost=36.75..1407672.76 rows=1935067087 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Append  (cost=0.00..7020.68 rows=1749 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"              Filter: (subset.id = id)"
"        ->  Index Scan using newtable01_idversion_pk on newtable_01  (cost=0.56..151.97 rows=39 width=66)"
"              Index Cond: (id = subset.id)"
...
"        ->  Index Scan using newtable86_idversion_pk on newtable_86  (cost=0.56..12.42 rows=2 width=66)"
"              Index Cond: (id = subset.id)"
Total query runtime: 140 ms. 4 rows retrieved.


Using an index, not the primary key ------------------------------------------------------------------------------
--Subset provides 58 group_id pointing to 5978 records in the concerned tables
select * from oldtable where group_id IN (select * from subset)
"Nested Loop  (cost=37.33..21575715.89 rows=2028512050 width=66)"
"  ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"        ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
"  ->  Index Scan using oldtable_groupid_idx on oldtable  (cost=0.58..107364.99 rows=51340 width=66)"
"        Index Cond: (group_id = subset.id)"
Total query runtime: 3986 ms. 5978 rows retrieved.


select * from newtable where group_id IN (select * from subset)
"Hash Join  (cost=41.25..138092255.85 rows=1935067087 width=66)"
"  Hash Cond: (newtable.group_id = subset.id)"
"  ->  Append  (cost=0.00..84877869.72 rows=3870134173 width=66)"
"        ->  Seq Scan on newtable  (cost=0.00..0.00 rows=1 width=66)"
"        ->  Seq Scan on newtable_01  (cost=0.00..946235.96 rows=46526896 width=66)"
...
"        ->  Seq Scan on newtable_86  (cost=0.00..986527.64 rows=44269664 width=66)"
"  ->  Hash  (cost=38.75..38.75 rows=200 width=8)"
"        ->  HashAggregate  (cost=36.75..38.75 rows=200 width=8)"
"              ->  Seq Scan on subset  (cost=0.00..31.40 rows=2140 width=8)"
Execution Cancelled after 766702 ms !

I tried the same with "SET enable_seqscan = OFF" and got an index scan of all tables;



--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux