Roger Pack <rogerdpack2@xxxxxxxxx> writes:
> As a note, I ran into the following today (doing a select distinct is fast,
> doing a count distinct is significantly slower?)

The planner appears to prefer hash aggregation for the variants of your
query wherein the DISTINCT becomes a separate plan step.  This is
evidently a good choice, with only 6192 distinct values (hence just that
many hash table entries) in 7495551 input rows.  However, COUNT(DISTINCT),
or any other aggregate with a DISTINCT tag, uses
sort-then-remove-adjacent-duplicates logic for DISTINCT.  That's
evidently a good deal slower for your data set; most likely the data
doesn't fit in your work_mem setting so the sort spills to disk.

The reason DISTINCT aggregates don't consider hash aggregation is partly
lack of round tuits but mostly that an aggregate needs to execute in a
fairly limited amount of memory, and we can't be sure that the hash table
wouldn't get unreasonably large.

			regards, tom lane
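
For illustration only, the two strategies contrasted above can be sketched in a few lines of Python. This is not PostgreSQL's implementation; the function names and the synthetic input are invented for the example, and only the figure of 6192 distinct values is taken from the thread. The sketch shows why the hashed variant needs memory proportional to the number of distinct values, while the sorted variant has to order every input row first.

```python
from itertools import groupby


def count_distinct_sorted(values):
    """Sort-then-remove-adjacent-duplicates: order all rows, count the runs."""
    # Every input row takes part in the sort. In a database, this is the
    # step that may spill to disk when the data exceeds the memory budget.
    ordered = sorted(values)
    # groupby yields one group per run of adjacent equal values.
    return sum(1 for _key, _run in groupby(ordered))


def count_distinct_hashed(values):
    """Hash aggregation: keep one hash-table entry per distinct value."""
    seen = set()
    for value in values:
        seen.add(value)  # memory grows with distinct values, not with rows
    return len(seen)


# Hypothetical input: many rows, but only 6192 distinct values.
rows = [i % 6192 for i in range(100_000)]
sorted_count = count_distinct_sorted(rows)
hashed_count = count_distinct_hashed(rows)
```

Both functions return the same count. The difference lies in the work and memory involved: the hashed version never holds more than one entry per distinct value, which is cheap here but has no upper bound in general, and that is the concern raised above.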