Scott Marlowe <scott.marlowe@xxxxxxxxx> writes:
> If you're de-duping a whole table, no need to create indexes, as it's
> gonna have to hit every row anyway.  Fastest way I've found has been:

> select a,b,c into newtable from oldtable group by a,b,c;

> One pass, done.

> If you want to use less than the whole row, you can use select
> distinct on (col1, col2) * into newtable from oldtable;

Also, the DISTINCT ON method can be refined to control which of a set
of duplicate keys is retained, if you can identify additional columns
that constitute a preference order for retaining/discarding dupes.
See the "latest weather reports" example in the SELECT reference page.

In any case, it's advisable to crank up work_mem while performing this
operation.

			regards, tom lane
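
For illustration, the DISTINCT ON refinement might look roughly like the following. This is only a sketch: the updated_at column is a hypothetical "preference" column that is not mentioned anywhere in the thread, and the work_mem value is an arbitrary figure that would need to be sized to the machine.

```sql
-- Assumption: oldtable has a timestamp column updated_at (hypothetical),
-- and we want to keep the newest row for each (col1, col2) key.

-- Raise work_mem for this session only; 256MB is an illustrative value.
SET work_mem = '256MB';

-- ORDER BY must begin with the DISTINCT ON expressions. The columns that
-- follow them decide which row of each duplicate set comes first, and
-- DISTINCT ON keeps only that first row.
SELECT DISTINCT ON (col1, col2) *
  INTO newtable
  FROM oldtable
 ORDER BY col1, col2, updated_at DESC;

-- Return to the configured default for the rest of the session.
RESET work_mem;
```

Without the ORDER BY, which row survives from each set of duplicates is unpredictable, so the extra sort columns are what make the choice deterministic.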