On Thu, Jan 13, 2011 at 5:41 PM, Robert Haas <robertmhaas@xxxxxxxxx> wrote:
> On Thu, Jan 13, 2011 at 5:26 PM, Tom Lane <tgl@xxxxxxxxxxxxx> wrote:
>> Robert Haas <robertmhaas@xxxxxxxxx> writes:
>>> On Thu, Jan 13, 2011 at 3:12 PM, Jon Nelson <jnelson+pgsql@xxxxxxxxxxx> wrote:
>>>> I still think that having UNION do de-duplication of each contributory
>>>> relation is a beneficial thing to consider -- especially if postgresql
>>>> thinks the uniqueness is not very high.
>>
>>> This might be worth a TODO.
>>
>> I don't believe there is any case where hashing each individual relation
>> is a win compared to hashing them all together. If the optimizer were
>> smart enough to be considering the situation as a whole, it would always
>> do the latter.
>
> You might be right, but I'm not sure. Suppose that there are 100
> inheritance children, and each has 10,000 distinct values, but none of
> them are common between the tables. In that situation, de-duplicating
> each individual table requires a hash table that can hold 10,000
> entries. But deduplicating everything at once requires a hash table
> that can hold 1,000,000 entries.
>
> Or am I all wet?

Yeah, I'm all wet, because you'd still have to re-de-duplicate at the
end. But then why did the OP get a speedup? *scratches head*

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx)
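
For anyone who wants to check the hash-table sizing argument above, here is a rough Python sketch. It is only a toy model: Python sets stand in for hash tables, and the function names (make_tables, dedup_all_at_once, dedup_per_table_then_merge) and the "each value appears twice" assumption are made up for illustration. It says nothing about how the PostgreSQL planner or executor actually behaves.

```python
def make_tables(n_tables, n_distinct, copies=2):
    """Build n_tables lists of integers (hypothetical test data).

    Table i holds n_distinct values, each repeated `copies` times,
    and no value is shared between tables.
    """
    return [
        [i * n_distinct + v for v in range(n_distinct)] * copies
        for i in range(n_tables)
    ]


def dedup_all_at_once(tables):
    """Hash every row into one set; return (result, final set size)."""
    seen = set()
    for table in tables:
        seen.update(table)
    # The set only grows, so its final size is also its peak size.
    return seen, len(seen)


def dedup_per_table_then_merge(tables):
    """De-duplicate each table separately, then de-duplicate the union.

    Returns (result, largest per-table set size, final set size).
    """
    per_table_peak = 0
    final = set()
    for table in tables:
        local = set(table)  # per-table hash table
        per_table_peak = max(per_table_peak, len(local))
        final.update(local)  # the second de-duplication pass is still needed
    return final, per_table_peak, len(final)


if __name__ == "__main__":
    # Numbers from the message: 100 children, 10,000 distinct values each.
    tables = make_tables(100, 10_000)
    _, all_peak = dedup_all_at_once(tables)
    _, per_peak, final_peak = dedup_per_table_then_merge(tables)
    print("all at once, entries:", all_peak)
    print("per table, largest set:", per_peak)
    print("per table, final set:", final_peak)
```

With those numbers each per-table set holds 10,000 entries, but the final set holds 1,000,000 entries in both strategies, which is the "you'd still have to re-de-duplicate at the end" point. In this toy model the only difference is how many rows are fed into the final set.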