Hi Pavel, See output of explain (analyze,timing off), the total runtime is close to the one enable timing. dev=# EXPLAIN (ANALYZE, TIMING OFF) select cha_type,sum(visits) from weekly_non_hstore a join seg1 b on a.ref_id=b.ref_id group by cha_type order by sum(visits)
desc; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=3674118.09..3674476.91 rows=143528 width=27) (actual rows=3639539 loops=1) Sort Key: (sum(a.visits)) Sort Method: quicksort Memory: 391723kB -> HashAggregate (cost=3660388.94..3661824.22 rows=143528 width=27) (actual rows=3639539 loops=1) -> Hash Join (cost=12029.58..3301288.46 rows=71820096 width=27) (actual rows=36962761 loops=1) Hash Cond: ((a.ref_id)::text = (b.ref_id)::text) -> Seq Scan on weekly_non_hstore a (cost=0.00..1852856.96 rows=71820096 width=75) (actual rows=71818882 loops=1) -> Hash (cost=7382.59..7382.59 rows=371759 width=47) (actual rows=371759 loops=1) Buckets: 65536 Batches: 1 Memory Usage: 28951kB -> Seq Scan on seg1 b (cost=0.00..7382.59 rows=371759 width=47) (actual rows=371759 loops=1) Total runtime: 42914.194 ms (11 rows) dev=# explain (analyze, timing off) select cha_type, sum(visits) from (select (each(visits)).key as cha_type,(each(visits)).value::numeric as visits from weekly_hstore
a join seg1 b on a.ref_id=b.ref_id )foo group by cha_type order by sum(visits) desc; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Sort (cost=7599039.89..7599040.39 rows=200 width=64) (actual rows=3639539 loops=1) Sort Key: (sum((((each(a.visits)).value)::numeric))) Sort Method: quicksort Memory: 394779kB -> HashAggregate (cost=7599030.24..7599032.24 rows=200 width=64) (actual rows=3639539 loops=1) -> Hash Join (cost=12029.58..2022645.24 rows=371759000 width=186) (actual rows=36962761 loops=1) Hash Cond: ((a.ref_id)::text = (b.ref_id)::text) -> Seq Scan on weekly_hstore a (cost=0.00..133321.14 rows=1292314 width=232) (actual rows=1292314 loops=1) -> Hash (cost=7382.59..7382.59 rows=371759 width=47) (actual rows=371759 loops=1) Buckets: 65536 Batches: 1 Memory Usage: 28951kB -> Seq Scan on seg1 b (cost=0.00..7382.59 rows=371759 width=47) (actual rows=371759 loops=1) Total runtime: 69521.570 ms (11 rows) Thanks, Suya From: Pavel Stehule [mailto:pavel.stehule@xxxxxxxxx]
2014-09-01 8:54 GMT+02:00 Huang, Suya <Suya.Huang@xxxxxxxxxxxxxxx>: Thank you Pavel. The cost of unpacking hstore comparing to non-hstore could be calculated by: Seq scan on hstore table + hash join with seg1 table: Hstore: 416.741+ 34619.879 =~34 seconds Non-hstore: 8858.594 +26477.652 =~ 34 seconds The subsequent hash-aggregate and sort operation should be working on the unpacked hstore rows which
has same row counts as non-hstore table. however, timing on those operations actually makes the big difference.
These values can be messy -- timing in EXPLAIN ANALYZE has relative big impact but different for some methods try to watch complete time for EXPLAIN (ANALYZE, TIMING OFF)
|