Tom Lane wrote:
Alban Hertroys <alban@xxxxxxxxxxxxxxxxx> writes:
zorgweb_solaris=> select * from pg_stats where attname = 'number' and
tablename IN ('mm_insrel_table', 'mm_product_table', 'mm_object');
tablename | mm_product_table
histogram_bounds | {2930,3244,3558,3872,4186,4500,4814,5128,5442,5756,6070}
tablename | mm_insrel_table
histogram_bounds | {615920,689286,750855,812003,872741,933041,1004672,1068250,1134894,1198559,1261685}
tablename | mm_object
histogram_bounds | {287,124412,256534,375896,505810,643940,770327,899229,1028933,1153260,1262338}
OK, so here's our problem: according to those stats, the ranges of
"number" in mm_product_table and mm_insrel_table don't overlap at all.
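One way to check whether the histograms reflect the real value ranges is to compare the actual minimum and maximum of "number" in the two tables. This is only a sketch: the table and column names come from the pg_stats output above, and the output aliases (lowest, highest, row_count) are made up for illustration.

```sql
-- Actual range of "number" in each table, for comparison with the
-- first and last entries of histogram_bounds in pg_stats.
SELECT 'mm_product_table' AS tablename,
       min(number) AS lowest,
       max(number) AS highest,
       count(*)    AS row_count
  FROM mm_product_table
UNION ALL
SELECT 'mm_insrel_table',
       min(number),
       max(number),
       count(*)
  FROM mm_insrel_table;
```

If the lowest value in mm_insrel_table is well below the first histogram bound shown above, or the highest value in mm_product_table is well above the last one, the ranges overlap more than the statistics suggest.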
That's correct, the numbers are generated by a global sequence.
Insrel.number can never match a product.number.
However, mm_product.number always matches either mm_insrel.snumber or
mm_insrel.dnumber (source and destination respectively). The reverse
does not hold: snumber and dnumber always match a number field in some
table, but not necessarily in mm_product.
So the cost model for mergejoin predicts that a mergejoin on "number"
will have to read all of mm_product_table but only the first record from
mm_insrel_table, and given the difference in size of the two tables,
that looks like a pretty good deal.
Given that the plan is not actually very fast, I suppose that the
histogram is not telling the whole truth --- probably there are a few
outlying records in one table or the other causing there to be a more
significant overlap than the planner expects. If so, you can probably
fix it by increasing the statistics target for that table.
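Raising the statistics target could look roughly like the following. The value 100 is an arbitrary example, not a figure from this thread, and mm_insrel_table is picked only for illustration; the same statements would apply to mm_product_table.

```sql
-- Raise the per-column statistics target for "number".
-- 100 is an assumed example value, not a recommendation from the thread.
ALTER TABLE mm_insrel_table ALTER COLUMN number SET STATISTICS 100;

-- Re-gather statistics so the new target takes effect.
ANALYZE mm_insrel_table;

-- Re-inspect the histogram to see whether the bounds now show any overlap.
SELECT tablename, histogram_bounds
  FROM pg_stats
 WHERE tablename = 'mm_insrel_table'
   AND attname = 'number';
```

The histograms shown earlier each have 11 bounds, i.e. 10 buckets. A higher target gives the planner more buckets, so a small number of outlying values is more likely to show up in the bounds.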
That's a bit odd, as the number fields of the different tables are
globally unique by definition.
regards, tom lane
Regards,
--
Alban Hertroys
alban@xxxxxxxxxxxxxxxxx
magproductions b.v.