I was recently asked to look into why one particular set of queries was taking a long time. The queries are all of the same form. They select the UNION of a few columns from around 100 tables. The query in particular was taking some 7-8 minutes to run. On a whim, I changed the query from this form: SELECT a, b FROM FOO_a WHERE <conditions> UNION SELECT a,b FROM FOO_b WHERE <conditions> .... to: SELECT DISTINCT a,b FROM FOO_a WHERE <conditions> UNION SELECT DISTINCT a,b FROM FOO_b WHERE <conditions> ... and the query time dropped to under a minute. In the former case, the query plan was a bitmap heap scan for each table. Then those results were Appended, Sorted, Uniqued, Sorted again, and then returned. In the latter, before Appending, each table's results were run through HashAggregate. The total number of result rows is in the 500K range. Each table holds approximately 150K matching rows (but this can vary a bit). What I'm asking is this: since adding DISTINCT to each participating member of the UNION query reduced the total number of appended rows, is there some sort of heuristic that postgresql could use to do this automatically? The 12x speedup was quite nice. -- Jon -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance