Tom Lane wrote:
David Rysdam <drysdam@xxxxxxxxxx> writes:
merge join (cost=0.00..348650.65 rows=901849 width=12)
merge cond {blah}
join filter {blah}
index scan using {blah index on blah} (cost=0.00..289740.65
rows=11259514 width=8)
index scan using {blah index on blah} (cost=0.00..17229.93
rows=902085 width=8)
This query takes about 3 minutes to run and I'm trying to figure out
why. From a tutorial and the docs, I gather that the "..largenum" part
is the number of page reads required, so I understand where 289740 and
17229 come from. But what about 348650 page reads for the "merge
join"?
You're misreading it. An upper node's cost includes the cost of its
children. So the actual cost estimate for the join step is 41680.07.
When I do EXPLAIN ANALYZE, the actual values come out like this:
merge join: (actual time=170029.404..170029.404)
That seems a bit odd ... is there only one row produced? Could you show
us the entire EXPLAIN ANALYZE output, rather than your assumptions about
what's important?
Increasing work_mem won't help a merge join, but if you can get it large
enough to allow a hash join to be used instead, that might be a win.
regards, tom lane
I'm looking for certain anomalies, so the end result should be zero rows.
merge join (cost=0.00..348650.65 rows=901849 width=12) (actual
time=170029.404..170029.404 rows=0 loops=1)
merge cond {blah}
join filter {blah}
index scan using {blah index on blah} (cost=0.00..289740.65
rows=11259514 width=8) (actual time=29.227..85932.426 rows=11256725 loops=1)
index scan using {blah index on blah} (cost=0.00..17229.93 rows=902085
width=8) (actual time=39.896..6766.755 rows=902236 loops=1)
Total runtime: 172469.209 ms