Search Postgresql Archives

Re: Check for duplicates before inserting new rows

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 04/09/2020 15:46, Rich Shepard wrote:
On Fri, 4 Sep 2020, Olivier Gautherot wrote:

First of all, what version of PostgreSQL are you using?

Olivier,

12.2.

One way would be to add a UNIQUE constraint and perform for each row of the
source table a INSERT ... ON CONFLICT DO NOTHING (see
https://www.postgresql.org/docs/current/sql-insert.html#SQL-ON-CONFLICT)

If it is a 2-way merge, I would encapsulate the process in a function
(which will create a transaction to protect your process) and add a column
to trace the rows that have been merged. For this purpose, you can use the
ON CONFLICT DO UPDATE ...

Lots for me to ponder.

The new data has attributes stored in two database tables: location and
measurements. I'll need to think about how both could be handled in a single
function. For example, the new data contains a row with a location not
already in the existing location table. That location needs to be added to
the location table and its associated measurement attributes then added to
the existing measurements table.

Thanks for the suggestion,

Rich



Assuming some simple table structures (I've not included PK or FK definitions for simplicity):

create table station (station_id integer, station_data text);

create table measurement (measurement_id bigserial, station_id integer, ameasurement text);

-- Add some test stations
insert into station(station_id, station_data)
values (1, 'station1'),
 (2, 'station2'),
 (3, 'station3') ;


*Query to identify new measurements:
select station_id, ameasurement
from (VALUES (1,'1meas1'), -- this represents your new test data set
             (1,'1meas2'),
             (2,'2meas1'),
             (3,'3meas1')) as m(station_id, ameasurement)
except
select station_id, ameasurement
from measurement;

The query above will give you a list of the new values which are not currently stored in table measurement.
Run it and we get all 4 rows returned.

Add a measurement row:
insert into measurement(station_id, ameasurement)
values (2,'2meas1');

Now if you repeat the check for new measurements with the same query as at *, you only get 3 rows.

Cheers, Chris Sterritt

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux