On Wed, Sep 15, 2010 at 2:26 AM, Anssi Kääriäinen <anssi.kaariainen@xxxxxx> wrote: > Hello all, > > I am trying to use aggregate queries in views, and when joining these views > to other > tables, I get seq scan in the view, even if index scan would be clearly > better. The views > I am using in my Db are actually long pivot queries, but the following > simple test case is enough > to show the problem. > > I will first show the table definitions, then the performance problem I am > having. > > create table test1 ( > id serial primary key not null, > other_id integer unique not null > ); > > create table test2 ( > id integer not null references test1(id), > type integer, > value text > ); > > create index test2_idx on test2(id); > > insert into test1 select g, g+10000 from (select generate_series(1, 10000) > as g) t; > insert into test2 select g, g%3, 'testval'||g from (select > generate_series(1, 10000) as g) t; > insert into test2 select g, (g+1)%3, 'testval'||g from (select > generate_series(1, 10000) as g) t; > insert into test2 select g, (g+2)%3, 'testval'||g from (select > generate_series(1, 10000) as g) t; > > Now, the following query is fast: > > select * from test1 inner join (select array_agg(value), id > from test2 group by id) t on test1.id = t.id where test1.id = 1; > (0.6ms) > > But the following query is slow (seqscan on test2): > > select * from test1 inner join (select array_agg(value), id > from test2 group by id) t on test1.id = t.id where test1.other_id = 10001; > (45ms) > > The same problem can be seen when running: > > select * from test1 inner join (select array_agg(value), id > from test2 group by id) t on test1.id = t.id where test1.id in (1, 2); > (40ms runtime) > > Fetching directly from test2 with id is fast: > > select array_agg(value), id > from test2 where test2.id in (1, 2) group by id; > > If I set enable_seqscan to off, then I get fast results: > > select * from test1 inner join (select array_agg(value), id > from test2 group by id) t on test1.id = t.id where test1.other_id in (10001, > 10002); > (0.6ms) > > Or slow results, if the fetched rows happen to be in the end of the index: > > select * from test1 inner join (select array_agg(value), id > from test2 group by id) t on test1.id = t.id where test1.other_id = 20000; > (40ms) > > Explain analyzes of the problematic query: > > With enable_seqscan: > > explain analyze select * from test1 inner join (select array_agg(value), id > from test2 group by id) t on test1.id = t.id where test1.other_id = 10001; > > Hash Join (cost=627.48..890.48 rows=50 width=44) (actual > time=91.575..108.085 rows=1 loops=1) > Hash Cond: (test2.id = test1.id) > -> HashAggregate (cost=627.00..752.00 rows=10000 width=15) (actual > time=82.663..98.281 rows=10000 loops=1) > -> Seq Scan on test2 (cost=0.00..477.00 rows=30000 width=15) > (actual time=0.009..30.650 rows=30000 loops=1) > -> Hash (cost=0.47..0.47 rows=1 width=8) (actual time=0.026..0.026 > rows=1 loops=1) > -> Index Scan using test1_other_id_key on test1 (cost=0.00..0.47 > rows=1 width=8) (actual time=0.018..0.021 rows=1 loops=1) > Index Cond: (other_id = 10001) > Total runtime: 109.686 ms > > Without enable_seqscan: > > explain analyze select * from test1 inner join (select array_agg(value), id > from test2 group by id) t on test1.id = t.id where test1.other_id = 10001; > > Merge Join (cost=0.48..895.91 rows=50 width=44) (actual time=0.066..0.085 > rows=1 loops=1) > Merge Cond: (test2.id = test1.id) > -> GroupAggregate (cost=0.00..769.56 rows=10000 width=15) (actual > time=0.040..0.054 rows=2 loops=1) > -> Index Scan using test2_idx on test2 (cost=0.00..494.56 > rows=30000 width=15) (actual time=0.017..0.030 rows=7 loops=1) > -> Sort (cost=0.48..0.48 rows=1 width=8) (actual time=0.020..0.022 > rows=1 loops=1) > Sort Key: test1.id > Sort Method: quicksort Memory: 17kB > -> Index Scan using test1_other_id_key on test1 (cost=0.00..0.47 > rows=1 width=8) (actual time=0.010..0.012 rows=1 loops=1) > Index Cond: (other_id = 10001) Take a look at this, and the responses. Is it the same case?: http://www.mail-archive.com/pgsql-performance@xxxxxxxxxxxxxx/msg21756.html merlin -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance