The data originated in a spreadsheet and, based on my experience, contains duplicate records. After reformatting there are 143,260 rows to insert into the table. The approach I tried has problems (explained below), and I would like to learn the proper way to insert rows into either an empty table or one with existing rows, since I'll need this procedure for other projects.

The table was created with the primary key, and I used INSERT INTO ... to load the data. There were many duplicate records, so I split the file into smaller ones and re-ran the command to load them, then removed the reported duplicate rows from the source (text) file. Between runs, I'd issue the DROP TABLE command within psql, recreate the table, and check that it was empty using 'select count(*) from waterchem;'.

It appeared that I had removed the duplicates from the first couple of smaller files, so I combined them into one file named ok.sql. But when I tested the combined file, it too reported many duplicate records. Something is wrong here.

Perhaps a better approach is to put the CREATE TABLE command above the INSERT INTO section of the file (without specifying a PK), load that using the command 'psql -d <database> -f waterchem.sql', then add the PK and remove duplicates as Postgres presents them. Or perhaps there is a much more efficient way to accomplish this task. Rather than flailing around and spending a lot of time failing to load all the non-duplicate rows, I'd like to learn the proper way to do this.

Suggestions, recommendations, and your experiences are requested.

TIA,

Rich
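P.S. To make that "load first, add the PK later" idea concrete, here is a rough sketch of what I have in mind. The column names (site_id, sample_date, reading) are placeholders rather than my real schema, and the dedupe keeps one arbitrary row when several rows share the same key:

    CREATE TABLE waterchem (
        site_id     integer,
        sample_date date,
        reading     numeric
    );  -- no PRIMARY KEY yet

    -- Load all 143,260 rows, duplicates included:
    --   psql -d <database> -f waterchem.sql

    -- Keep one row per key and delete the rest. ctid is the row's
    -- physical address, so it identifies each individual copy:
    DELETE FROM waterchem
    WHERE ctid IN (
        SELECT ctid
        FROM (SELECT ctid,
                     row_number() OVER
                         (PARTITION BY site_id, sample_date) AS rn
              FROM waterchem) t
        WHERE rn > 1
    );

    -- With the duplicates gone, adding the PK should succeed:
    ALTER TABLE waterchem ADD PRIMARY KEY (site_id, sample_date);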
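P.P.S. Or, if a staging table is the cleaner route, maybe something like this (same placeholder columns, and assuming the duplicates are exact copies across all columns, so SELECT DISTINCT removes them):

    -- Staging table with the same columns but no constraints;
    -- LIKE copies the column definitions, not the primary key:
    CREATE TABLE waterchem_staging (LIKE waterchem);

    -- Load the raw file into the staging table, then copy only
    -- the distinct rows into the real (PK-constrained) table:
    INSERT INTO waterchem
        SELECT DISTINCT * FROM waterchem_staging;

    DROP TABLE waterchem_staging;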