John McKown wrote
> I don't know, myself, why this would be faster. But I'm not any kind of a
> PostgreSQL expert either.

It is faster because PostgreSQL does not have native parallelism. By putting a condition such as a % n = k in a WHERE clause, you can start n separate sessions, give each one a different remainder k (0 through n-1), and so manually introduce parallelism into the activity.

Given that this is likely to be I/O constrained, though, the possible gains do not scale linearly with the number of sessions, and the useful number of sessions itself effectively maxes out at the number of cores available to the server.

David J.

--
View this message in context: http://postgresql.nabble.com/Re-Removing-duplicate-records-from-a-bulk-upload-rationale-behind-selecting-a-method-tp5829682p5830353.html
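To make the a % n split concrete, here is a rough sketch that only builds the per-session predicates; it does not connect to a database. The table name "staging" and the helper names are made up for illustration, and "a" is assumed to be a non-negative integer column. That assumption matters because PostgreSQL's % returns a negative remainder for negative values, so those rows would fall outside the 0 to n-1 buckets.

```python
def partition_predicates(n_sessions, column="a"):
    """Return one WHERE predicate per session.

    For non-negative integer values, the predicates are mutually
    exclusive and together cover every row exactly once.
    """
    if n_sessions < 1:
        raise ValueError("n_sessions must be at least 1")
    return [f"{column} % {n_sessions} = {k}" for k in range(n_sessions)]


def assign_session(value, n_sessions):
    """Index of the session whose predicate matches a non-negative value."""
    return value % n_sessions


# Hypothetical usage: 4 sessions, each running the same statement
# against its own slice of a made-up table called "staging".
N_SESSIONS = 4
for k, predicate in enumerate(partition_predicates(N_SESSIONS)):
    print(f"-- session {k}")
    print(f"SELECT count(*) FROM staging WHERE {predicate};")
```

Each printed statement would be run in its own connection at the same time. Since the work is probably I/O bound, it seems sensible to start with a small n, well under the core count, and measure before adding more sessions.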