"John D. Burger" <john@xxxxxxxxx> writes: > I have two queries for looking up related words which I think should > be equivalent, but 7.4.8 comes up with very different plans. They're not at all equivalent: > explain analyze select w2.word from allwords w1 join allwordrelations > as r on (w1.wordid = r.word1id) join allwords w2 on (w2.wordid = > r.word2id) where w1.word = 'dogging'; > explain analyze select w2.word from allwords w1, allwords w2 where > (w1.wordid, w2.wordid) in (select word1id, word2id from > allwordrelations ) and w1.word = 'dogging'; If there are duplicate word1id,word2id entries in allwordrelations, the first query will produce duplicate outputs; the second will not. If there were a unique constraint on (word1id, word2id), in theory the planner could prove that the IN form could be simplified to a plain join, but there is no such logic in HEAD let alone 7.4, and in any case you've not got such a constraint. The plan that gets chosen is to forcibly unique-ify the (word1id, word2id) data (via a "sort | uniq"-like pipeline) and then do a normal join with that. Which is expensive because allwordrelations is big. But the alternative is probably even worse: without that allwordrelations has to be joined to w1 and w2 simultaneously, meaning that the unconstrained cartesian product of w1 and w2 has to be formed first. regards, tom lane