Re: hash aggregation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Oct 11, 2012 at 9:14 PM, Korisk <Korisk@xxxxxxxxx> wrote:
> Strange situation.
> After indexscan enabling the cost is seriously decreased.

AFAIK when the planner has to choose between index scans and seq scans
and both of this options are off it uses one of this strategies anyway
but puts 10000000000.00 as a lower cost for this (thanks Maxim Boguk
for the explanation in chat).

>    ->  Index Only Scan using hashcheck_name_rev_idx on public.hashcheck  (cost=10000000000.00..10000466660.96 rows=25990002 width=32) (act
> ual time=0.166..3698.776 rows=25990002 loops=1)

So when you enabled one of these options it started using it as usual.

> hashes=# set enable_indexscan=on;
> SET
> hashes=# explain  analyse verbose select name, count(name) as cnt from  hashcheck group by name order by name desc;

[cut]

>    ->  Index Only Scan using hashcheck_name_rev_idx on public.hashcheck  (cost=0.00..466660.96 rows=25990002 width=32) (actual time=0.129.
> .3653.848 rows=25990002 loops=1)

What I can not understand is why the seq scan's estimated cost is
better the index scan's one. It depends on the number of pages in
index/relation. May be the index is heavily bloated?

Let's see the sizes:

select pg_total_relation_size('hashcheck')
select pg_total_relation_size('hashcheck_name_rev_idx');


> hashes=# set enable_seqscan=on;
> SET
> hashes=# explain  analyse verbose select name, count(name) as cnt from  hashcheck group by name order by name desc;
>                                                                   QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------------------------------------
> -----
>  Sort  (cost=565411.67..565412.17 rows=200 width=32) (actual time=21746.799..21747.026 rows=4001 loops=1)
>    Output: name, (count(name))
>    Sort Key: hashcheck.name
>    Sort Method: quicksort  Memory: 315kB
>    ->  HashAggregate  (cost=565402.03..565404.03 rows=200 width=32) (actual time=21731.551..21733.277 rows=4001 loops=1)
>          Output: name, count(name)
>          ->  Seq Scan on public.hashcheck  (cost=0.00..435452.02 rows=25990002 width=32) (actual time=29.431..13383.812 rows=25990002 loop
> s=1)
>                Output: id, name, value
>  Total runtime: 21747.356 ms
> (9 rows)
>
>
>
>
>



-- 
Sergey Konoplev

a database and software architect
http://www.linkedin.com/in/grayhemp

Jabber: gray.ru@xxxxxxxxx Skype: gray-hemp Phone: +14158679984


-- 
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux