On Thu, Jun 3, 2010 at 11:19 AM, Torsten Zühlsdorff <foo@xxxxxxxxxxxxxxxxxxx> wrote: > Scott Marlowe schrieb: >> >> On Tue, Jun 1, 2010 at 9:03 AM, Torsten Zühlsdorff >> <foo@xxxxxxxxxxxxxxxxxxx> wrote: >>> >>> Hello, >>> >>> i have a set of unique data which about 150.000.000 rows. Regullary i get >>> a >>> list of data, which contains multiple times of rows than the already >>> stored >>> one. Often around 2.000.000.000 rows. Within this rows are many >>> duplicates >>> and often the set of already stored data. >>> I want to store just every entry, which is not within the already stored >>> one. Also i do not want to store duplicates. Example: >> >> The standard method in pgsql is to load the data into a temp table >> then insert where not exists in old table. > > Sorry, i didn't get it. I've googled some examples, but no one match at my > case. Every example i found was a single insert which should be done or > ignored, if the row is already stored. > > But in my case i have a bulk of rows with duplicates. Either your tipp > doesn't match my case or i didn't unterstand it correctly. Can you provide a > simple example? create table main (id int primary key, info text); create table loader (id int, info text); insert into main values (1,'abc'),(2,'def'),(3,'ghi'); insert into loader values (1,'abc'),(4,'xyz'); select * from main; id | info ----+------ 1 | abc 2 | def 3 | ghi (3 rows) select * from loader; id | info ----+------ 1 | abc 4 | xyz (2 rows) insert into main select * from loader except select * from main; select * from main; id | info ----+------ 1 | abc 2 | def 3 | ghi 4 | xyz (4 rows) Note that for the where not exists to work the fields would need to be all the same, or you'd need a more complex query. If the info field here was different you'd get an error an no insert / update. For that case you might want to use "where not in": insert into main select * from loader where id not in (select id from main); If you wanted the new rows to update pre-existing rows, then you could run an update first where the ids matched, then the insert where no id matches. -- Sent via pgsql-performance mailing list (pgsql-performance@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-performance