On Feb 11, 2013, at 2:23, Tim Uckun <timuckun@xxxxxxxxx> wrote:

> This works pretty good except for when the top 100 records have
> duplicated email address (two sales for the same email address).
>
> I am wondering what the best strategy is for dealing with this
> scenario. Doing the records one at a time would work but obviously it
> would be much slower. There are no other columns I can rely on to
> make the record more unique either.

The best strategy is to fix your data model so that every record has a unique key. As you have already found, e-mail addresses aren't very suitable as unique keys for people. For this particular case I'd suggest adding a surrogate key.

Alternatively, you might try using (first_name, email) as your key. You'll probably still get some duplicates, but there should be fewer of them, and perhaps few enough for your case.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.
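
To make the two suggestions concrete, here is a minimal sketch that compares how many duplicates each candidate key leaves. It uses Python's sqlite3 module purely as a stand-in for PostgreSQL; the table name `sales`, the sample rows, and the helpers `build_sales_table` and `duplicate_keys` are hypothetical and not taken from the original thread. In PostgreSQL the surrogate key would typically be a `serial` column rather than SQLite's `INTEGER PRIMARY KEY AUTOINCREMENT`.

```python
import sqlite3


def build_sales_table(rows):
    """Create an in-memory table with a surrogate key and load (first_name, email) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sales ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"  # surrogate key, assigned by the database
        " first_name TEXT NOT NULL,"
        " email TEXT NOT NULL)"
    )
    conn.executemany("INSERT INTO sales (first_name, email) VALUES (?, ?)", rows)
    return conn


def duplicate_keys(conn, columns):
    """Return (key values..., count) for every key that occurs more than once.

    Column names are interpolated directly, so pass only trusted identifiers.
    """
    cols = ", ".join(columns)
    query = (
        f"SELECT {cols}, COUNT(*) FROM sales "
        f"GROUP BY {cols} HAVING COUNT(*) > 1 ORDER BY {cols}"
    )
    return conn.execute(query).fetchall()


# Hypothetical sample data: one repeat customer and one shared address.
rows = [
    ("Ann", "ann@example.com"),
    ("Ann", "ann@example.com"),     # same person, second sale
    ("Bob", "shared@example.com"),
    ("Cid", "shared@example.com"),  # different people sharing one address
]

conn = build_sales_table(rows)
dups_email = duplicate_keys(conn, ["email"])
dups_name_email = duplicate_keys(conn, ["first_name", "email"])
dups_id = duplicate_keys(conn, ["id"])
```

With this sample data, `email` alone collides for both addresses, (first_name, email) still collides for the repeat customer, and the surrogate `id` never collides, so a batch can address each sale individually.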