Re: Shards + hash = forever running queries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Daniele Varrazzo <daniele.varrazzo@xxxxxxxxx> writes:
> Since "big" was sharded, the query plan results in something like:

>  Hash Join  (cost=10000000001.23..30038997974.72 rows=10 width=753)
>    Hash Cond: (b.id = i.big_id)
>    ->  Append  (cost=0.00..20038552251.23 rows=118859245 width=11)
>          ->  Index Scan using big_201207_pkey on big_201207 b
> (cost=0.00..2224100.46 rows=1609634 width=12)
>          ->  Index Scan using big_201101_pkey on big_201101 b
> (cost=0.00..404899.71 rows=5437497 width=12)
>          ->  Index Scan using big_201104_pkey on big_201104 b
> (cost=0.00..349657.58 rows=4625181 width=12)
>          -> [...all the shards]
>    ->  Hash  (cost=10000000001.10..10000000001.10 rows=10 width=742)
>          ->  Seq Scan on small i  (cost=10000000000.00..10000000001.10
> rows=10 width=742)

[ squint... ]  9.1 certainly ought to be able to find a smarter plan for
such a case.  For instance, if I try this on 9.1 branch tip:

regression=# create table p (id int primary key);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "p_pkey" for table "p"
CREATE TABLE
regression=# create table c1 (primary key (id)) inherits(p);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "c1_pkey" for table "c1"
CREATE TABLE
regression=# create table c2 (primary key (id)) inherits(p);
NOTICE:  CREATE TABLE / PRIMARY KEY will create implicit index "c2_pkey" for table "c2"
CREATE TABLE
regression=# explain select * from p,int4_tbl where id=f1;
                                   QUERY PLAN                                   
--------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..53.25 rows=120 width=8)
   Join Filter: (public.p.id = int4_tbl.f1)
   ->  Seq Scan on int4_tbl  (cost=0.00..1.05 rows=5 width=4)
   ->  Append  (cost=0.00..10.40 rows=3 width=4)
         ->  Index Scan using p_pkey on p  (cost=0.00..1.87 rows=1 width=4)
               Index Cond: (id = int4_tbl.f1)
         ->  Index Scan using c1_pkey on c1 p  (cost=0.00..4.27 rows=1 width=4)
               Index Cond: (id = int4_tbl.f1)
         ->  Index Scan using c2_pkey on c2 p  (cost=0.00..4.27 rows=1 width=4)
               Index Cond: (id = int4_tbl.f1)
(10 rows)

You have evidently got enable_seqscan turned off, so I wonder whether
the cost penalties applied by that are swamping the estimates.  Do you
get any better results if you re-enable that?

			regards, tom lane

-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux