Re: vacuum analyze slows sql query

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Sorry for the late reply.  Was feeling a bit under the weather
this weekend and didn't get a chance to look at this.


--- Tom Lane <tgl@xxxxxxxxxxxxx> wrote:

> patrick ~ <sidsrr@xxxxxxxxx> writes:
> >  PREPARE pkk_00 ( integer ) <the def of pkk_offer_has_pending_purc( integer
> )
> 
> This is what you want to do, but not quite like that.  The PREPARE
> determines the plan and so VACUUMing and re-EXECUTing is going to show
> the same plan.  What we need to look at is
> 	- standing start
> 	PREPARE pkk_00 ...
> 	EXPLAIN ANALYZE EXECUTE pkk_00 ...
> 	VACUUM ANALYZE;
> 	PREPARE pkk_01 ...
> 	EXPLAIN ANALYZE EXECUTE pkk_01 ...

But of course!  I feel a bit silly now.

This is what I get after following Tom's directions:

pkk=# prepare pkk_00 ( integer ) as select ...
PREPARE
Time: 1.753 ms
pkk=# execute pkk_00(    241 );
 case 
------
 f
(1 row)

Time: 0.788 ms
pkk=# explain analyze execute pkk_00(    241 );
      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=10.73..10.74 rows=1 width=0) (actual time=0.067..0.068 rows=1
loops=1)
   InitPlan
     ->  Limit  (cost=0.00..10.73 rows=1 width=4) (actual time=0.055..0.055
rows=0 loops=1)
           ->  Index Scan using pur_offer_id_idx on pkk_purchase p0 
(cost=0.00..20690.18 rows=1929 width=4) (actual time=0.052..0.052 rows=0
loops=1)
                 Index Cond: (offer_id = $1)
                 Filter: ((((expire_time)::timestamp with time zone > now()) OR
(expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL) OR
(pending = true)))
 Total runtime: 0.213 ms
(7 rows)

Time: 24.654 ms
pkk=# vacuum analyze ;
VACUUM
Time: 128826.078 ms
pkk=# prepare pkk_01 ( integer ) as select ...
PREPARE
Time: 104.658 ms
pkk=# execute pkk_01(    241 );
 case 
------
 f
(1 row)

Time: 7652.708 ms
pkk=# explain analyze execute pkk_01(    241 );
      QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=2.66..2.67 rows=1 width=0) (actual time=2872.211..2872.213
rows=1 loops=1)
   InitPlan
     ->  Limit  (cost=0.00..2.66 rows=1 width=4) (actual
time=2872.189..2872.189 rows=0 loops=1)
           ->  Seq Scan on pkk_purchase p0  (cost=0.00..37225.83 rows=13983
width=4) (actual time=2872.180..2872.180 rows=0 loops=1)
                 Filter: ((offer_id = $1) AND (((expire_time)::timestamp with
time zone > now()) OR (expire_time IS NULL) OR (pending = true)) AND
((cancel_date IS NULL) OR (pending = true)))
 Total runtime: 2872.339 ms
(6 rows)

Time: 2873.479 ms


So it looks like after the VACCUM the planner resorts to Seq Scan
rather than Index Scan.

This is because of the value of correlation field in pg_stats
(according to PostgreSQL docs) being closer to 0 rather than
±1:

pkk=# select tablename,attname,correlation from pg_stats where tablename =
'pkk_purchase' and attname = 'offer_id' ;
  tablename   | attname  | correlation 
--------------+----------+-------------
 pkk_purchase | offer_id |    0.428598
(1 row)


So I started to experiment with ALTER TABLE SET STATISTICS
values to see which gets the correlation closer to ±1.  The
trend seems to indicat the higher the stat value is set it
pushes the correlation value closer to 0:

set statistics   correlation
----------------------------
         800     0.393108
         500     0.408137
         200     0.43197
          50     0.435211
           1     0.45758

And a subsequent PREPARE and EXPLAIN ANALYZE confirms that
the Planer reverts back to using the Index Scan after setting
stats to 1 (even though correlation value is still closer
to 0 than 1):

pkk=# explain analyze execute pkk_02(    241 );
                                                                               
 QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Result  (cost=2.95..2.96 rows=1 width=0) (actual time=0.068..0.069 rows=1
loops=1)
   InitPlan
     ->  Limit  (cost=0.00..2.95 rows=1 width=4) (actual time=0.056..0.056
rows=0 loops=1)
           ->  Index Scan using pur_offer_id_idx on pkk_purchase p0 
(cost=0.00..35810.51 rows=12119 width=4) (actual time=0.053..0.053 rows=0
loops=1)
                 Index Cond: (offer_id = $1)
                 Filter: ((((expire_time)::timestamp with time zone > now()) OR
(expire_time IS NULL) OR (pending = true)) AND ((cancel_date IS NULL) OR
(pending = true)))
 Total runtime: 0.200 ms
(7 rows)



So, is this the ultimate solution to this issue?

--patrick


		
__________________________________ 
Do you Yahoo!? 
Check out the new Yahoo! Front Page. 
www.yahoo.com 
 



[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux