On Sun, Feb 15, 2015 at 01:06:02PM +0100, Francisco Olarte wrote:
> You state below 200k rows, 50k lines per patch. That is not huge unless
> "series" is really big, is it?

The series data is between 100 and 4096 characters.

> 1.- Get the patches into a ( temp ) table, using something like \copy, call
> this patches_in.
> 2.- create (temp) table existing_out as select series, id from dictionary
> join patches_in on (series);
> 3.- delete from patches_in where series in (select series from
> existing_out);
> 4.- create (temp) table new_out as insert into dictionary (series) select
> patches_in.series from patches_in returning series, id
> 5.- Copy existing out and patches out.
> 6.- Cleanup temps.

That sounds good, but I'm a bit worried about the performance of lookups on the "series" column, and about the time needed to build an index on "series" for the temp table. Perhaps it's better to try this first, though, and only do some optimizations, such as partitioning, if performance turns out to be really bad.

Thank you!

-- 
Eugene Dzhurinsky
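
P.S. For reference, a minimal psql sketch of the steps Francisco describes. It assumes the dictionary table looks like dictionary(id serial primary key, series text unique) and that the incoming series arrive one per line in a file called patches.txt; those names, the id type, and the output file names are placeholders, not from the thread.

    -- 1. load the incoming series into a temp table (assumed file name)
    CREATE TEMP TABLE patches_in (series text);
    \copy patches_in (series) from 'patches.txt'

    -- 2. pairs that already exist in the dictionary
    CREATE TEMP TABLE existing_out AS
      SELECT d.series, d.id
        FROM dictionary d
        JOIN patches_in USING (series);

    -- 3. drop the series we already have ids for
    DELETE FROM patches_in
     WHERE series IN (SELECT series FROM existing_out);

    -- 4. insert the remaining series and capture the generated ids
    --    (id assumed integer; adjust to match dictionary.id)
    CREATE TEMP TABLE new_out (series text, id integer);
    WITH ins AS (
        INSERT INTO dictionary (series)
        SELECT series FROM patches_in
        RETURNING series, id
    )
    INSERT INTO new_out SELECT series, id FROM ins;

    -- 5. copy both result sets back out (placeholder file names)
    \copy existing_out to 'existing_ids.txt'
    \copy new_out to 'new_ids.txt'

    -- 6. temp tables are dropped automatically at end of session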