Hi there,

Below is a snippet of the data that gets generated every day and then posted to a central server, which needs to load it into a (preferably PostgreSQL) database: a summary file (24-hour records, i.e. ~86400-second intervals) and a detail file (~300-second intervals).

My first problem is that the bulk of the earlier data (except the first entry) is repeated in the files from the different days, as restricting the extraction to just the previous day is a tad brain-dead/difficult to get done, and we prefer to have the extra data around just in case.

My real problem is importing the data in bulk while keeping the version of each record that has the longest interval value (the fourth field in the CSV below). Compare the 03/29 entries of the *.gs files: the 20120329 file has a partial-day interval (57606.0), while the 20120330 file carries the full-day value (86409.5). The *.gd files behave the same way, except there would only be a single entry that might be shorter, though we'll also be adding a full day's worth of extra ~5-minute-interval data.

If I set a unique key on (date, time, interval_length) (or on (secondsSince1970, interval_length)), I can reject the identical entries, but I'm then left with duplicates on (date, time) (or on secondsSince1970) and would need to clean those up with duplicate-detection SQL afterwards. I'd prefer to prevent that at load time.

Any advice/ideas on the K.I.S.S. way to implement an insert/update at load time instead of doing the select/delete search for duplicates afterwards?

(Columns: Date, Time, secondsSince1970, interval_length in seconds, performance data...)

[visagehe@tsysl01 capacity]$ tail -n4 daily/dnba01-201203*gs
==> daily/dnba01-20120329.gs <==
03/26/2012,00:00:00,1332720000,86411.1, 4.10, 0.18, 53.36, 93,
03/27/2012,00:00:00,1332806400,86406.9, 4.23, 0.21, 53.63, 96,
03/28/2012,00:00:00,1332892800,86409.0, 4.15, 0.18, 53.94, 93,
03/29/2012,00:00:00,1332979200,57606.0, 4.17, 0.17, 54.19, 78,

==> daily/dnba01-20120330.gs <==
03/27/2012,00:00:00,1332806400,86406.9, 4.23, 0.21, 53.63, 96,
03/28/2012,00:00:00,1332892800,86409.0, 4.15, 0.18, 53.94, 93,
03/29/2012,00:00:00,1332979200,86409.5, 4.16, 0.20, 54.18, 93,
03/30/2012,00:00:00,1333065600, 300.4, 14.73, 5.22, 54.64, 79,
[visagehe@tsysl01 capacity]$

[visagehe@tsysl01 capacity]$ tail -n4 daily/dnba01-201203*gd
==> daily/dnba01-20120329.gd <==
03/29/2012,15:40:00,1333035600, 300.1, 4.35, 0.06, 54.12, 77,
03/29/2012,15:45:00,1333035900, 300.0, 3.62, 0.18, 54.14, 80,
03/29/2012,15:50:00,1333036200, 300.0, 3.67, 0.16, 54.17, 79,
03/29/2012,15:55:00,1333036500, 299.0, 5.27, 0.39, 54.54, 82,

==> daily/dnba01-20120330.gd <==
03/29/2012,23:45:00,1333064700, 300.1, 3.60, 0.03, 54.29, 73,
03/29/2012,23:50:00,1333065000, 300.0, 3.71, 0.04, 54.30, 77,
03/29/2012,23:55:00,1333065300, 300.0, 5.17, 0.05, 54.68, 76,
03/30/2012,00:00:00,1333065600, 300.4, 14.73, 5.22, 54.64, 79,
[visagehe@tsysl01 capacity]$
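
To make the question concrete, here is roughly what I mean by "insert/update at load time" (all table and column names below are made up for illustration; the dummy last column is only there to absorb the trailing comma in the files). The idea would be to \copy each day's file into a scratch table and then merge it, keeping whichever row has the longer interval_length:

-- Illustrative schema only; real names/types may differ.
CREATE TABLE perf (
    sample_date        date,
    sample_time        time,
    seconds_since_1970 bigint PRIMARY KEY,
    interval_length    numeric,
    metric1            numeric,
    metric2            numeric,
    metric3            numeric,
    metric4            numeric,
    filler             text    -- absorbs the trailing comma in each line
);

CREATE TEMP TABLE perf_stage (LIKE perf);

-- Bulk-load one day's file into the scratch table.
\copy perf_stage from 'daily/dnba01-20120330.gs' with csv

BEGIN;

-- Overwrite an existing row only when the new copy has a longer interval.
UPDATE perf p
   SET interval_length = s.interval_length,
       metric1 = s.metric1,
       metric2 = s.metric2,
       metric3 = s.metric3,
       metric4 = s.metric4
  FROM perf_stage s
 WHERE p.seconds_since_1970 = s.seconds_since_1970
   AND s.interval_length > p.interval_length;

-- Insert the rows that are not present in the main table at all yet.
INSERT INTO perf
SELECT s.*
  FROM perf_stage s
 WHERE NOT EXISTS (SELECT 1
                     FROM perf p
                    WHERE p.seconds_since_1970 = s.seconds_since_1970);

COMMIT;

That is one load per file inside a transaction; it isn't safe against concurrent loaders, but for a once-a-day batch that shouldn't matter. Is something along these lines the K.I.S.S. way, or is there a neater trick to do it in a single step?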