On Fri, Nov 11, 2011 at 7:36 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
> Ruslan Zakirov <ruz@xxxxxxxxxxxxxxxxx> writes:
>> A table has two columns id and EffectiveId. First is primary key.
>> EffectiveId is almost always equal to id (95%) unless records are
>> merged. Many queries have id = EffectiveId condition. Both columns are
>> very distinct and Pg reasonably decides that condition has very low
>> selectivity and picks sequence scan.
>
> I think the only way is to rethink your data representation. PG doesn't
> have cross-column statistics at all, and even if it did, you'd be asking
> for an estimate of conditions in the "long tail" of the distribution.
> That's unlikely to be very accurate.

Rethinking the schema is an option that requires more consideration, as
we have done it this way for years and run the product on MySQL, Pg and
Oracle. The issue affects Oracle, but it can be worked around by dropping
indexes or maybe by building correlation statistics in 11g (didn't try
it yet).

I wonder if the "CROSS COLUMN STATISTICS" patch that floats around would
help in such a case?

> Consider adding a "merged" boolean, or defining effectiveid differently.
> For instance you could set it to null in unmerged records; then you
> could get the equivalent of the current meaning with
> COALESCE(effectiveid, id). In either case, PG would then have
> statistics that bear directly on the question of how many merged vs
> unmerged records there are.

NULL in EffectiveId is the way to go; however, when we actually need
those records (not a frequent situation), the query becomes frightening:

    SELECT main.* FROM Tickets main
        JOIN Tickets te
            ON te.EffectiveId = main.id
            OR (te.id = main.id AND te.EffectiveId IS NULL)
        JOIN OtherTable ot
            ON ot.Ticket = te.id

Past experience reminds me that joins with ORs are poorly handled by many
optimizers. In the current situation the join condition is very
straightforward and effective.

> regards, tom lane

--
Best regards, Ruslan.
--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance