Re: How to identify duplicate rows

Berend Tober <btober@xxxxxxxxxxxxxxxx> · Sun, 19 Mar 2006 21:01:42 -0500

Peter Eisentraut wrote:

David Inglis wrote:

Can anybody assist with this problem? I have a table that has some
duplicated rows of data. I want to place a unique constraint on the
columns userid and procno to eliminate this problem in the future, but
how do I identify and get rid of the existing duplication?

To find them, something like

SELECT a, b, c FROM table GROUP BY a, b, c HAVING count(*) > 1;

comes to mind, where you have to list all columns of the table in place 
of a, b, c.
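
For the case in the original question, where only userid and procno are meant to be unique, a minimal sketch might group on just those two columns. The table name mytable is a placeholder, not something from the original post:

```sql
-- Hypothetical table name: mytable.
-- Lists each (userid, procno) pair that occurs more than once,
-- together with the number of rows sharing that pair.
SELECT userid, procno, count(*) AS copies
FROM mytable
GROUP BY userid, procno
HAVING count(*) > 1
ORDER BY copies DESC;
```

Rows reported here may still differ in their other columns, so it is worth looking at them before deciding which copy to keep.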

As for deleting all but one row in a duplicated group, you will probably 
have to address the individual rows by the oid or ctid system columns.
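
A rough sketch of the ctid approach, assuming a reasonably recent PostgreSQL release (one that supports DELETE ... USING and comparison operators on ctid), a placeholder table name mytable, and duplicates defined by userid and procno only:

```sql
-- Hypothetical table name: mytable.
-- For every pair of rows with the same (userid, procno), delete the one
-- with the higher ctid, so the row with the lowest ctid in each group survives.
BEGIN;

DELETE FROM mytable a
USING mytable b
WHERE a.userid = b.userid
  AND a.procno = b.procno
  AND a.ctid > b.ctid;

-- Check the remaining rows here; issue ROLLBACK instead if the result looks wrong.
COMMIT;
```

Which copy survives is essentially arbitrary, so this only makes sense if the duplicates are interchangeable. Rows where userid or procno is NULL are not matched by the equality tests and are left alone.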

The other idea is to run

CREATE TABLE newtable AS SELECT DISTINCT * FROM oldtable;

Note that this does not carry over any foreign key relationships or 
triggers to the new table, nor indexes, constraints, or column defaults.
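
One way the copy-and-swap might look in full is sketched below. The names oldtable_backup and oldtable_userid_procno_key are made up for illustration, and anything that depended on the old table would have to be recreated by hand afterwards:

```sql
BEGIN;

-- Copy only the distinct rows into a fresh table.
CREATE TABLE newtable AS SELECT DISTINCT * FROM oldtable;

-- Keep the original around under a backup name (hypothetical) and swap in the copy.
ALTER TABLE oldtable RENAME TO oldtable_backup;
ALTER TABLE newtable RENAME TO oldtable;

-- Prevent future duplicates; the constraint name is an assumption.
ALTER TABLE oldtable
    ADD CONSTRAINT oldtable_userid_procno_key UNIQUE (userid, procno);

COMMIT;
```

SELECT DISTINCT * only collapses rows that are identical in every column. If two rows share userid and procno but differ elsewhere, the ADD CONSTRAINT step fails and the whole transaction rolls back, leaving the original table untouched.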

Another approach (if you don't have OIDs) is to create uniqueness by 
appending a column to the table and populating it with sequential 
integers. Then you proceed as otherwise suggested above, using 
aggregation to identify the duplicated rows.
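
A minimal sketch of that idea, assuming a placeholder table mytable, a temporary column called row_id, and duplicates defined by userid and procno (none of these names beyond the two columns come from the thread):

```sql
BEGIN;

-- Adding a serial column numbers the existing rows from its sequence.
ALTER TABLE mytable ADD COLUMN row_id serial;

-- Keep the lowest-numbered row in each (userid, procno) group, delete the rest.
DELETE FROM mytable
WHERE row_id NOT IN (
    SELECT min(row_id)
    FROM mytable
    GROUP BY userid, procno
);

-- The helper column is no longer needed once the duplicates are gone.
ALTER TABLE mytable DROP COLUMN row_id;

-- Stop the problem from recurring; the constraint name is an assumption.
ALTER TABLE mytable
    ADD CONSTRAINT mytable_userid_procno_key UNIQUE (userid, procno);

COMMIT;
```

Doing it all in one transaction means a failure at any step leaves the table as it was.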