On Thu, Jan 13, 2011 at 5:47 PM, Andy Colson <andy@xxxxxxxxxxxxxxx> wrote: >>>> I don't believe there is any case where hashing each individual relation >>>> is a win compared to hashing them all together. If the optimizer were >>>> smart enough to be considering the situation as a whole, it would always >>>> do the latter. >>> >>> You might be right, but I'm not sure. Suppose that there are 100 >>> inheritance children, and each has 10,000 distinct values, but none of >>> them are common between the tables. In that situation, de-duplicating >>> each individual table requires a hash table that can hold 10,000 >>> entries. But deduplicating everything at once requires a hash table >>> that can hold 1,000,000 entries. >>> >>> Or am I all wet? >> >> Yeah, I'm all wet, because you'd still have to re-de-duplicate at the >> end. But then why did the OP get a speedup? *scratches head* > > Because it all fix it memory and didnt swap to disk? Doesn't make sense. The re-de-duplication at the end should use the same amount of memory regardless of whether the individual relations have already been de-duplicated. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance