Tom Lane wrote:
2. transform joins into subselects, then return subselect rows via an
index bitmap. Joins are performed via a bitmap addition process.
Looks like 8.1 pretty much does this right now:
First the basic star:
EXPLAIN ANALYZE
SELECT
d0.dmth,
d1.dat,
count(f.fval )
FROM
dim0 AS d0,
dim1 AS d1,
fact0 AS f
WHERE d0.d0key = f.d0key
AND d1.d1key = f.d1key
AND d0.dyr BETWEEN 2010 AND 2015
AND d1.dattyp BETWEEN '10th measure type' AND '14th measure type'
GROUP BY
d0.dmth,
d1.dat
;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=334842.41..334846.53 rows=329 width=37) (actual
time=144317.960..144318.814 rows=120 loops=1)
-> Hash Join (cost=145.72..334636.91 rows=27400 width=37) (actual
time=1586.363..142831.025 rows=201600 loops=1)
Hash Cond: ("outer".d0key = "inner".d0key)
-> Hash Join (cost=89.72..333279.41 rows=137001 width=37)
(actual time=1467.322..135585.317 rows=1000000 loops=1)
Hash Cond: ("outer".d1key = "inner".d1key)
-> Seq Scan on fact0 f (cost=0.00..281819.45
rows=10000045 width=12) (actual time=120.881..70364.473 rows=10000000
loops=1)
-> Hash (cost=89.38..89.38 rows=137 width=33) (actual
time=24.822..24.822 rows=660 loops=1)
-> Index Scan using dim1_dattyp on dim1 d1
(cost=0.00..89.38 rows=137 width=33) (actual time=0.502..19.374 rows=660
loops=1)
Index Cond: (((dattyp)::text >= '10th
measure type'::text) AND ((dattyp)::text <= '14th measure type'::text))
-> Hash (cost=51.00..51.00 rows=2000 width=8) (actual
time=31.620..31.620 rows=2016 loops=1)
-> Index Scan using dim0_dyr on dim0 d0
(cost=0.00..51.00 rows=2000 width=8) (actual time=0.379..17.377
rows=2016 loops=1)
Index Cond: ((dyr >= 2010) AND (dyr <= 2015))
Total runtime: 144320.588 ms
(13 rows)
Now apply the star transformation:
EXPLAIN ANALYZE
SELECT
d0.dmth,
d1.dat,
count(f.fval )
FROM
dim0 AS d0,
dim1 AS d1,
fact0 AS f
WHERE d0.d0key = f.d0key
AND d1.d1key = f.d1key
AND d0.dyr BETWEEN 2010 AND 2015
AND d1.dattyp BETWEEN '10th measure type' AND '14th measure type'
AND f.d0key IN (SELECT cd0.d0key FROM dim0 cd0
WHERE cd0.dyr BETWEEN 2010 AND 2015)
AND f.d1key IN (SELECT cd1.d1key FROM dim1 cd1
WHERE cd1.dattyp BETWEEN '10th measure type'
AND '14th measure type')
GROUP BY
d0.dmth,
d1.dat
;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=129230.89..129231.83 rows=75 width=37) (actual
time=39798.192..39799.015 rows=120 loops=1)
-> Nested Loop IN Join (cost=149.44..129230.33 rows=75 width=37)
(actual time=269.919..38125.520 rows=201600 loops=1)
-> Hash Join (cost=147.43..128171.03 rows=375 width=45)
(actual time=269.516..27342.866 rows=201600 loops=1)
Hash Cond: ("outer".d0key = "inner".d0key)
-> Nested Loop (cost=91.43..128096.03 rows=2000
width=37) (actual time=152.084..19869.365 rows=1000000 loops=1)
-> Hash Join (cost=91.43..181.52 rows=2
width=37) (actual time=29.931..46.339 rows=660 loops=1)
Hash Cond: ("outer".d1key = "inner".d1key)
-> Index Scan using dim1_dattyp on dim1 d1
(cost=0.00..89.38 rows=137 width=33) (actual time=0.516..7.683
rows=660 loops=1)
Index Cond: (((dattyp)::text >= '10th
measure type'::text) AND ((dattyp)::text <= '14th measure type'::text))
-> Hash (cost=91.09..91.09 rows=137
width=4) (actual time=29.238..29.238 rows=660 loops=1)
-> HashAggregate (cost=89.72..91.09
rows=137 width=4) (actual time=20.940..24.900 rows=660 loops=1)
-> Index Scan using dim1_dattyp
on dim1 cd1 (cost=0.00..89.38 rows=137 width=4) (actual
time=0.042..14.841 rows=660 loops=1)
Index Cond:
(((dattyp)::text >= '10th measure type'::text) AND ((dattyp)::text <=
'14th measure type'::text))
-> Index Scan using fact0_d1key on fact0 f
(cost=0.00..62707.26 rows=100000 width=12) (actual time=0.205..12.691
rows=1515 loops=660)
Index Cond: ("outer".d1key = f.d1key)
-> Hash (cost=51.00..51.00 rows=2000 width=8) (actual
time=31.264..31.264 rows=2016 loops=1)
-> Index Scan using dim0_dyr on dim0 d0
(cost=0.00..51.00 rows=2000 width=8) (actual time=0.339..16.885
rows=2016 loops=1)
Index Cond: ((dyr >= 2010) AND (dyr <= 2015))
-> Bitmap Heap Scan on dim0 cd0 (cost=2.00..2.81 rows=1
width=4) (actual time=0.031..0.031 rows=1 loops=201600)
Recheck Cond: ("outer".d0key = cd0.d0key)
Filter: ((dyr >= 2010) AND (dyr <= 2015))
-> Bitmap Index Scan on dim0_d0key (cost=0.00..2.00
rows=1 width=0) (actual time=0.015..0.015 rows=1 loops=201600)
Index Cond: ("outer".d0key = cd0.d0key)
Total runtime: 39800.294 ms
(24 rows)
The real run times are more like 24s and 9s, but you get the idea.
Cheers
Mark