Re: Speed difference between select ... union select ... and select from partitioned_table

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



These are the EXPLAIN ANALIZE:

I ran both queries on a CLUSTER and ANALYZEd tables:

UNION QUERY
explain analyze
select e, p,  sum( c) as c
from (
        select e, p, count( *) as c
        from tt_00003
        group by e, p
        union
        select e, p, count( *) as c
        from tt_00006
        group by e, p
        union
        select e, p, count( *) as c
        from tt_00009
        group by e, p
        union
        select e, p, count( *) as c
        from tt_00012
        group by e, p
        union
        select e, p, count( *) as c
        from tt_00015
        group by e, p
) as t
group by e, p
order by e, p desc;

"Sort  (cost=2549202.87..2549203.37 rows=200 width=16) (actual time=263593.182..263593.429 rows=207 loops=1)"
"  Sort Key: t.e, t.p"
"  ->  HashAggregate  (cost=2549192.73..2549195.23 rows=200 width=16) (actual time=263592.469..263592.763 rows=207 loops=1)"
"        ->  Unique  (cost=2549172.54..2549179.88 rows=734 width=8) (actual time=263590.481..263591.764 rows=356 loops=1)"
"              ->  Sort  (cost=2549172.54..2549174.38 rows=734 width=8) (actual time=263590.479..263590.891 rows=356 loops=1)"
"                    Sort Key: e, p, c"
"                    ->  Append  (cost=1307131.88..2549137.60 rows=734 width=8) (actual time=132862.176..263589.774 rows=356 loops=1)"
"                          ->  HashAggregate  (cost=1307131.88..1307133.03 rows=92 width=8) (actual time=132862.173..132862.483 rows=200 loops=1)"
"                                ->  Seq Scan on tt_00003  (cost=0.00..1081550.36 rows=30077536 width=8) (actual time=10.135..83957.424 rows=30000000 loops=1)"
"                          ->  HashAggregate  (cost=1241915.64..1241916.16 rows=42 width=8) (actual time=130726.219..130726.457 rows=156 loops=1)"
"                                ->  Seq Scan on tt_00006  (cost=0.00..1028793.22 rows=28416322 width=8) (actual time=11.389..87338.730 rows=28351293 loops=1)"
"                          ->  HashAggregate  (cost=24.53..27.03 rows=200 width=8) (actual time=0.005..0.005 rows=0 loops=1)"
"                                ->  Seq Scan on tt_00009  (cost=0.00..18.30 rows=830 width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"                          ->  HashAggregate  (cost=24.53..27.03 rows=200 width=8) (actual time=0.004..0.004 rows=0 loops=1)"
"                                ->  Seq Scan on tt_00012  (cost=0.00..18.30 rows=830 width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"                          ->  HashAggregate  (cost=24.53..27.03 rows=200 width=8) (actual time=0.005..0.005 rows=0 loops=1)"
"                                ->  Seq Scan on tt_00015  (cost=0.00..18.30 rows=830 width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"Total runtime: 263594.381 ms"


PARTITIONED QUERY

explain analyze
select e, p,  count( *) as c
from tt
group by e, p
order by e, p desc;

"GroupAggregate  (cost=13256958.67..13842471.95 rows=40000 width=8) (actual time=899391.384..1065585.531 rows=207 loops=1)"
"  ->  Sort  (cost=13256958.67..13403211.99 rows=58501328 width=8) (actual time=899391.364..989749.914 rows=58351293 loops=1)"
"        Sort Key: public.tt.e, public.tt.p"
"        ->  Append  (cost=0.00..2110508.28 rows=58501328 width=8) (actual time=14.031..485211.466 rows=58351293 loops=1)"
"              ->  Seq Scan on tt  (cost=0.00..18.30 rows=830 width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"              ->  Seq Scan on tt_00003 tt  (cost=0.00..1081550.36 rows=30077536 width=8) (actual time=14.024..178657.738 rows=30000000 loops=1)"
"              ->  Seq Scan on tt_00006 tt  (cost=0.00..1028793.22 rows=28416322 width=8) (actual time=39.852..168307.030 rows=28351293 loops=1)"
"              ->  Seq Scan on tt_00009 tt  (cost=0.00..18.30 rows=830 width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"              ->  Seq Scan on tt_00012 tt  (cost=0.00..18.30 rows=830 width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"              ->  Seq Scan on tt_00015 tt  (cost=0.00..18.30 rows=830 width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"              ->  Seq Scan on tt_00018 tt  (cost=0.00..18.30 rows=830 width=8) (actual time=0.001..0.001 rows=0 loops=1)"
"              ->  Seq Scan on tt_00021 tt  (cost=0.00..18.30 rows=830 width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"              ->  Seq Scan on tt_00024 tt  (cost=0.00..18.30 rows=830 width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"              ->  Seq Scan on tt_00027 tt  (cost=0.00..18.30 rows=830 width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"              ->  Seq Scan on tt_00030 tt  (cost=0.00..18.30 rows=830 width=8) (actual time=0.002..0.002 rows=0 loops=1)"
"Total runtime: 1066301.084 ms"

Any idea?

Regards

Pablo


Jeff Davis wrote:
On Fri, 2007-10-26 at 16:37 -0400, Pablo Alcaraz wrote:
  
Hi List!

I executed 2 equivalents queries. The first one uses a union structure. 
The second uses a partitioned table. The tables are the same with 30 
millions of rows each one and the returned rows are the same.

But the union query perform faster than the partitioned query.

    

I think you mean to use UNION ALL here. UNION forces a DISTINCT, which
results in a sort operation. What surprises me is that the UNION is
actually faster than the partitioning using inheritance.

I suspect it has something to do with the GROUP BYs, but we won't know
until you post EXPLAIN ANALYZE results.

    Regards,
	Jeff Davis


---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

  


[Postgresql General]     [Postgresql PHP]     [PHP Users]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Yosemite]

  Powered by Linux