Hi,
I was tuning a join on a few tables for a SELECT COUNT(*) query, when I
realized that the content of the second table didn't actually matter to
the count. So, I figured a LEFT OUTER JOIN would be faster, but...
apparently it's not.
Shouldn't the planner notice that the right part of SELECT COUNT(*) FROM
x LEFT OUTER JOIN y USING (id) isn't relevant for the result and skip
the join?
Attached are the explain plans for both versions and a few variations
(TB wraps it if I paste).
Regards,
--
Alban Hertroys
alban@xxxxxxxxxxxxxxxxx
magproductions b.v.
T: ++31(0)534346874
F: ++31(0)534346876
M:
I: www.magproductions.nl
A: Postbus 416
7500 AK Enschede
// Integrate Your World //
explain analyze select count(*) from mm_medical_care_container_table JOIN mm_object USING (number);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=73149.73..73149.73 rows=1 width=0) (actual time=2780.516..2780.517 rows=1 loops=1)
-> Hash Join (cost=2025.33..72927.66 rows=88826 width=0) (actual time=406.166..2774.618 rows=5936 loops=1)
Hash Cond: ("outer".number = "inner".number)
-> Seq Scan on mm_object (cost=0.00..28666.63 rows=1651963 width=4) (actual time=0.015..1074.074 rows=892597 loops=1)
-> Hash (cost=1369.26..1369.26 rows=88826 width=4) (actual time=22.724..22.724 rows=0 loops=1)
-> Seq Scan on mm_medical_care_container_table (cost=0.00..1369.26 rows=88826 width=4) (actual time=0.028..13.889 rows=5936 loops=1)
Total runtime: 2780.599 ms
explain analyze select count(*) from mm_medical_care_container_table LEFT OUTER JOIN mm_object USING (number);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=95099.40..95099.40 rows=1 width=0) (actual time=2919.999..2920.000 rows=1 loops=1)
-> Hash Left Join (cost=40863.54..94877.34 rows=88826 width=0) (actual time=2283.399..2913.955 rows=5936 loops=1)
Hash Cond: ("outer".number = "inner".number)
-> Seq Scan on mm_medical_care_container_table (cost=0.00..1369.26 rows=88826 width=4) (actual time=0.021..12.450 rows=5936 loops=1)
-> Hash (cost=28666.63..28666.63 rows=1651963 width=4) (actual time=2263.407..2263.407 rows=0 loops=1)
-> Seq Scan on mm_object (cost=0.00..28666.63 rows=1651963 width=4) (actual time=0.022..1067.016 rows=892597 loops=1)
Total runtime: 2920.260 ms
I figured maybe the '*' was the culprit, so I tried this as well:
explain analyze select count(mm_medical_care_container_table.*) from mm_medical_care_container_table LEFT OUTER JOIN mm_object USING (number);
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=95099.40..95099.40 rows=1 width=4) (actual time=2984.603..2984.603 rows=1 loops=1)
-> Hash Left Join (cost=40863.54..94877.34 rows=88826 width=4) (actual time=2318.856..2978.470 rows=5936 loops=1)
Hash Cond: ("outer".number = "inner".number)
-> Seq Scan on mm_medical_care_container_table (cost=0.00..1369.26 rows=88826 width=8) (actual time=0.024..16.904 rows=5936 loops=1)
-> Hash (cost=28666.63..28666.63 rows=1651963 width=4) (actual time=2295.350..2295.350 rows=0 loops=1)
-> Seq Scan on mm_object (cost=0.00..28666.63 rows=1651963 width=4) (actual time=0.018..1074.118 rows=892597 loops=1)
Total runtime: 2985.101 ms
And here's what I expected to happen:
explain analyze select count(*) from mm_medical_care_container_table;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=1591.33..1591.33 rows=1 width=0) (actual time=17.427..17.428 rows=1 loops=1)
-> Seq Scan on mm_medical_care_container_table (cost=0.00..1369.26 rows=88826 width=0) (actual time=0.021..12.186 rows=5936 loops=1)
Total runtime: 17.482 ms