Re: Optimization idea

Vlad Arkhipov <arhipov@xxxxxxxxxxxx> · Mon, 26 Apr 2010 11:52:23 +0900

On Thu, Apr 22, 2010 at 10:37 PM, Vlad Arkhipov <arhipov@xxxxxxxxxxxx> wrote:

I don't think this is just an issue with statistics, because the same
problem arises when I try executing a query like this:

I'm not sure how you think this proves that it isn't a problem with
statistics, but I think what you should be focusing on here, looking
back to your original email, is that the plans that are actually much
faster have almost as much estimated cost as the slower one.  Since
all your data is probably fully cached, at a first cut, I might try
setting random_page_cost and seq_page_cost to 0.005 or so, and
adjusting effective_cache_size to something appropriate.

...Robert

Ok. I thougth it's quite obvious because of these two queries. I can't
understand why the estimated rows count is 40040 in the first plan.

test=# explain analyze select * from t2 join t1 on t1.t = t2.t where
t1.t in (2,3,4);
                                                   QUERY PLAN
------------------------------------------------------------------------------------------------------------------
Hash Join  (cost=1.09..2319.87 rows=40040 width=32) (actual
time=0.050..356.269 rows=400 loops=1)
  Hash Cond: (t2.t = t1.t)
  ->  Seq Scan on t2  (cost=0.00..1543.00 rows=100100 width=16) (actual
time=0.013..176.087 rows=100100 loops=1)
  ->  Hash  (cost=1.07..1.07 rows=2 width=16) (actual time=0.023..0.023
rows=3 loops=1)
        Buckets: 1024  Batches: 1  Memory Usage: 1kB
        ->  Seq Scan on t1  (cost=0.00..1.07 rows=2 width=16) (actual
time=0.006..0.014 rows=3 loops=1)
              Filter: (t = ANY ('{2,3,4}'::bigint[]))
Total runtime: 356.971 ms
(8 rows)

test=# explain analyze select * from t2 join t1 on t1.t = t2.t where
t1.t = 2 union all select * from t2 join t1 on t1.t = t2.t where t1.t =
3 union all select * from t2 join t1 on t1.t = t2.t where t1.t = 4;
                                                        QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Append  (cost=0.00..112.42 rows=407 width=32) (actual time=0.048..3.487
rows=400 loops=1)
  ->  Nested Loop  (cost=0.00..47.51 rows=197 width=32) (actual
time=0.045..1.061 rows=200 loops=1)
        ->  Seq Scan on t1  (cost=0.00..1.06 rows=1 width=16) (actual
time=0.011..0.014 rows=1 loops=1)
              Filter: (t = 2)
        ->  Index Scan using t_idx on t2  (cost=0.00..44.48 rows=197
width=16) (actual time=0.026..0.382 rows=200 loops=1)
              Index Cond: (pg_temp_2.t2.t = 2)
  ->  Nested Loop  (cost=0.00..32.67 rows=117 width=32) (actual
time=0.019..0.599 rows=100 loops=1)
        ->  Seq Scan on t1  (cost=0.00..1.06 rows=1 width=16) (actual
time=0.003..0.006 rows=1 loops=1)
              Filter: (t = 3)
        ->  Index Scan using t_idx on t2  (cost=0.00..30.43 rows=117
width=16) (actual time=0.010..0.211 rows=100 loops=1)
              Index Cond: (pg_temp_2.t2.t = 3)
  ->  Nested Loop  (cost=0.00..28.17 rows=93 width=32) (actual
time=0.017..0.534 rows=100 loops=1)
        ->  Seq Scan on t1  (cost=0.00..1.06 rows=1 width=16) (actual
time=0.005..0.008 rows=1 loops=1)
              Filter: (t = 4)
        ->  Index Scan using t_idx on t2  (cost=0.00..26.18 rows=93
width=16) (actual time=0.007..0.187 rows=100 loops=1)
              Index Cond: (pg_temp_2.t2.t = 4)
Total runtime: 4.190 ms
(17 rows)

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance