On Mon, May 26, 2008 at 7:58 PM, mark <markkicks@xxxxxxxxx> wrote:
> On Mon, May 26, 2008 at 5:36 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>> mark <markkicks@xxxxxxxxx> writes:
>>> EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
>>> DESC limit 6;
>>>                                  QUERY PLAN
>>> ----------------------------------------------------------------------------------------------------------------------------------------------------
>>>  Limit  (cost=0.00..9329.02 rows=6 width=135) (actual
>>> time=13612.247..13612.247 rows=0 loops=1)
>>>    ->  Index Scan Backward using pokes_pkey on pokes
>>> (cost=0.00..5182270.69 rows=3333 width=135) (actual
>>> time=13612.245..13612.245 rows=0 loops=1)
>>>          Filter: (uid = 578439028)
>>>  Total runtime: 13612.369 ms
>>> (4 rows)
>>
>> The problem is the vast disconnect between the estimated and actual
>> rowcounts for the indexscan (3333 vs 0).  The planner thinks there
>> are three thousand rows matching uid = 578439028, and that encourages
>> it to try a plan that's only going to be fast if at least six such
>> rows show up fairly soon while scanning the index in reverse id order.
>> What you really want it to do here is scan on the uid index and then
>> sort the result by id ... but that will be slow in exactly the case
>> where this plan is fast, ie, when there are a lot of matching uids.
>>
>> Bottom line: the planner cannot make the right choice between these
>> alternatives unless it's got decent statistics about the frequency
>> of uid values.  "I analyzed the table about a week ago" is not good
>> enough maintenance policy --- you need current stats, and you might need
>> to bump up the statistics target to get enough data about less-common
>> values of uid.
>
> how do i do this? bump up the statistics target?
>
>> (Since it's 8.3, the autovac daemon might have been analyzing for you,
>> if you didn't turn off autovacuum.  In that case increasing the
>> statistics target is the first thing to try.)
>
> i did not turn it off..
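[Bumping the statistics target, as Tom suggests, is done per column with ALTER TABLE ... SET STATISTICS followed by a fresh ANALYZE. A minimal sketch against the table from this thread; 500 is just an illustrative value:

```sql
-- See what ANALYZE currently knows about pokes.uid
SELECT n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'pokes' AND attname = 'uid';

-- Raise the per-column statistics target (the 8.3 default is 10),
-- then re-gather stats so the planner sees the larger MCV list/histogram
ALTER TABLE pokes ALTER COLUMN uid SET STATISTICS 500;
ANALYZE pokes;
```
]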
> and my OS is fedora 9
>
> i ran vacuum verbose analyze pokes, and then ran the same query, and
> there is no improvement..
>
> EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id limit 6;
>                                  QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------
>  Limit  (cost=0.00..8446.80 rows=6 width=130) (actual
> time=12262.779..12262.779 rows=0 loops=1)
>    ->  Index Scan using pokes_pkey on pokes  (cost=0.00..5149730.49
> rows=3658 width=130) (actual time=12262.777..12262.777 rows=0 loops=1)
>          Filter: (uid = 578439028)
>  Total runtime: 12262.817 ms

OK I did this

ALTER TABLE pokes ALTER uid set statistics 500;
ALTER TABLE
ANALYZE pokes;
ANALYZE

and then it became super fast!! thanks a lot!!!

my questions:
-> is 500 too high? what all does this affect?
-> now increasing this number, does it affect only when i am running
analyze commands, or will it slow down inserts and other operations?

EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
desc limit 6;
                                                               QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=467.80..467.81 rows=6 width=134) (actual
time=0.016..0.016 rows=0 loops=1)
   ->  Sort  (cost=467.80..468.09 rows=117 width=134) (actual
time=0.016..0.016 rows=0 loops=1)
         Sort Key: id
         Sort Method:  quicksort  Memory: 25kB
         ->  Index Scan using idx_pokes_uid on pokes
(cost=0.00..465.70 rows=117 width=134) (actual time=0.011..0.011
rows=0 loops=1)
               Index Cond: (uid = 578439028)
 Total runtime: 0.037 ms
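[For reference, the same knob also exists server-wide as default_statistics_target; a per-column SET STATISTICS overrides it. A sketch, assuming access to postgresql.conf or a superuser session:

```sql
-- postgresql.conf: default_statistics_target = 100
-- or per-session, affecting only subsequent ANALYZE runs:
SET default_statistics_target = 100;
ANALYZE pokes;

-- a per-column setting can be reverted to the server default with -1
ALTER TABLE pokes ALTER COLUMN uid SET STATISTICS -1;
```

As to the cost question: a larger target adds work at ANALYZE time (more rows sampled, bigger entries in pg_statistic) and can slightly lengthen planning, since the planner scans longer most-common-value lists; it does not slow down inserts or other DML.]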