14.01.11 00:26, Tom Lane wrote:
> Robert Haas<robertmhaas@xxxxxxxxx> writes:
>> On Thu, Jan 13, 2011 at 3:12 PM, Jon Nelson<jnelson+pgsql@xxxxxxxxxxx> wrote:
>>> I still think that having UNION do de-duplication of each contributory
>>> relation is a beneficial thing to consider -- especially if postgresql
>>> thinks the uniqueness is not very high.
>> This might be worth a TODO.
> I don't believe there is any case where hashing each individual relation
> is a win compared to hashing them all together. If the optimizer were
> smart enough to be considering the situation as a whole, it would always
> do the latter.
What about cases where the individual relations are already sorted? Then
they can be deduplicated quickly and in a streaming manner. Even a
partial sort order may help, because you only need to deduplicate within
groups that have equal values in the sorted fields, which takes much less
memory and streams much better. And if all the individual deduplications
are streaming and sorted the same way, you can simply do a merge on top.
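
To make the idea concrete, here is a rough sketch in Python rather than
anything resembling the PostgreSQL executor. All function names are made
up for illustration, and rows are assumed to be plain tuples. It shows
streaming deduplication of a fully sorted input, deduplication of an
input sorted only on a leading prefix of its columns, and a merge on top
of several sorted inputs. Note that the merge still needs one more
adjacent-duplicate pass, since the same row can come from two relations.

```python
import heapq
from itertools import groupby


def dedup_sorted(rows):
    """Yield each distinct row once from an input sorted on the whole row.

    Duplicates are adjacent, so only the current row is kept in memory.
    """
    for row, _group in groupby(rows):
        yield row


def dedup_partially_sorted(rows, prefix_len):
    """Deduplicate rows that are sorted only on their first prefix_len columns.

    Duplicates can only occur inside a group of equal-prefix rows, so the
    hash set never holds more than one such group at a time.
    """
    for _prefix, group in groupby(rows, key=lambda r: r[:prefix_len]):
        seen = set()
        for row in group:
            if row not in seen:
                seen.add(row)
                yield row


def union_sorted(*relations):
    """UNION of fully sorted inputs: dedup each, merge, dedup the merge."""
    merged = heapq.merge(*(dedup_sorted(r) for r in relations))
    return dedup_sorted(merged)


# Two fully sorted relations with internal and cross-relation duplicates.
a = [(1, "x"), (1, "x"), (2, "y")]
b = [(1, "x"), (3, "z"), (3, "z")]
result = list(union_sorted(a, b))

# A relation sorted on its first column only.
partial = [(1, "b"), (1, "a"), (1, "b"), (2, "a"), (2, "a")]
partial_result = list(dedup_partially_sorted(partial, 1))
```

Everything here is a generator, so nothing is materialized except the
per-group set in the partially sorted case.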
Best regards, Vitalii Tymchyshyn.