Thanks Tom!Regret the delay in reply, but two of the three guesses were spot-on and resolved the doubt. 8.4.9 does take care of this case very well.
On 10/27/2011 01:27 AM, Tom Lane wrote:
I suspect that you're just fooling yourself here, and the "optimized" query is no such thing.
:) I actually meant 'faster' query, but well...
I believe you've nailed it pretty accurately. The tables are horrendously bloated and I may need to tune AutoVacuum to be much more aggressive than it is. I did see that HashAggregate makes only a minor difference, but what didn't strike is that the slowness could be bloat.1. The tables are horrendously bloated on the first database, so that many more pages have to be touched to get the same number of tuples. This would likely indicate an improper autovacuum configuration.
On hindsight, this was a miss. Should have warmed the caches before posting. Re-running this query multiple times, brought out the result in ~100ms.2. You failed to account for caching effects, ie the first example is being run "cold" and has to actually read everything from disk, whereas the second example has everything it needs already in RAM. In that case the speed differential is quite illusory.
Its just luck. Sometimes the corresponding values genuinely don't exist in the other table, so that's ok.BTW, how come is it that "SELECT large_table_b.field_b FROM large_table_b WHERE field_a = 2673056" produces no duplicate field_b values? Is that just luck? Is there a unique constraint on the table that implies it will happen?
-- Robins Tharakan
Attachment:
smime.p7s
Description: S/MIME Cryptographic Signature