Damn - so the unique constraint is still an issue. What gives? Why is
it so hard to implement this in PostgreSQL? Sigh - if only I had more time.
Ow Mun Heng wrote:
On Wed, 2007-11-14 at 00:02 -0800, Willem Buitendyk wrote:
Perfect - that appears to be exactly what I was looking for.
Reg Me Please wrote:
Il Wednesday 14 November 2007 05:50:36 Willem Buitendyk ha scritto:
Will Postgresql ever implement an ignore on error feature when bulk
loading data? Currently it is my understanding that any record that
violates a unique constraint will cause the "copy from" command to halt
execution instead of skipping over the violation and logging it - as is
done in Oracle and DB2.
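A common workaround in the meantime is to COPY into a constraint-free staging table and then move only the non-violating rows into the real table. The sketch below assumes the `csv` table from later in this thread; the staging-table name and file path are illustrative, and `ON CONFLICT` only exists on newer PostgreSQL releases:

```sql
-- Sketch of a staging-table workaround (names and path are illustrative).
-- COPY the raw file into a table with no constraints first:
CREATE TEMP TABLE csv_staging (LIKE csv INCLUDING DEFAULTS);

COPY csv_staging (a, b, d, c) FROM '/path/to/csv.data' WITH (FORMAT csv);

-- On PostgreSQL 9.5 and later, ON CONFLICT skips the duplicates:
INSERT INTO csv (a, b, c, d)
SELECT a, b, c, d FROM csv_staging
ON CONFLICT (a, b, c) DO NOTHING;

-- On older servers, an anti-join achieves the same effect:
INSERT INTO csv (a, b, c, d)
SELECT s.a, s.b, s.c, s.d
FROM csv_staging s
WHERE NOT EXISTS (
  SELECT 1 FROM csv t
  WHERE t.a = s.a AND t.b = s.b AND t.c = s.c
);
```

This doesn't log the rejected rows the way Oracle's SQL*Loader does, but it keeps a duplicate key from aborting the whole load.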
pgloader
http://pgfoundry.org/projects/pgloader/
I believe the last time I tried this, there were still some issues with
it. See attached email (if it makes it to the list).
------------------------------------------------------------------------
Subject: PgLoader unable to handle pkey dups Was [Re: {Spam} pgloader - Can't find textreader/csvreader]
From: Ow Mun Heng <Ow.Mun.Heng@xxxxxxx>
Date: Mon, 27 Aug 2007 18:01:54 +0800
To: Dimitri Fontaine <dfontaine@xxxxxxxxxxxx>
CC: pgsql-general@xxxxxxxxxxxxxx
On Mon, 2007-08-27 at 11:27 +0200, Dimitri Fontaine wrote:
We've just made some tests here with 2.2.1 and as this release contains the
missing files, it works fine without any installation.
Yep.. I can confirm that it works.. I am using the csv example.
Goal: functionality similar to MySQL's mysqlimport --replace
(overwrite any rows which have duplicate primary keys)
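On the database side, that "replace" behaviour can be approximated with an upsert. This is a sketch, not anything pgloader does itself: it assumes a hypothetical `csv_staging` table holding the freshly-loaded rows, and `ON CONFLICT ... DO UPDATE` requires PostgreSQL 9.5 or later:

```sql
-- Sketch: emulate mysqlimport --replace with an upsert on the csv example.
-- csv_staging is a hypothetical staging table the file was COPY'd into.
INSERT INTO csv (a, b, c, d)
SELECT a, b, c, d FROM csv_staging
ON CONFLICT (a, b, c) DO UPDATE SET d = EXCLUDED.d;
```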
$ psql pgloader < csv/csv.sql
$ ../pgloader.py -Tvc examples/pgloader.conf csv
pgloader=# alter table csv add primary key (a,b,c);
pgloader=# \d csv
Table "public.csv"
Column | Type | Modifiers
--------+--------------+-----------
a | bigint | not null
b | bigint | not null
c | character(2) | not null
d | text |
Indexes:
"csv_pkey" PRIMARY KEY, btree (a, b, c)
pgloader=# select * from csv;
a | b | c | d
----------+----------+----+----------------
33996344 | 33996351 | GB | United Kingdom
50331648 | 68257567 | US | United States
68257568 | 68257599 | CA | Canada
68257600 | 68259583 | US | United States
68259584 | 68259599 | CA | Canada
$cat csv/csv.data
"2.6.190.56","2.6.190.63","33996344","33996351","GB","Error Kingdom"
"4.17.143.0","4.17.143.15","68259584","68259599","CA","new Country"
(Note: only columns 3 to 6 are taken for loading)
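Since a single duplicate key aborts the whole COPY batch here, one pragmatic option is to pre-filter the data file before handing it to pgloader, keeping the last occurrence of each key (which mimics --replace semantics within the file). This is a sketch: the key positions (fields 3 to 5, matching a, b, c in the sample data) are an assumption, and it only removes duplicates within the file itself, not against rows already in the table:

```python
def dedupe_rows(rows, key_indices=(2, 3, 4)):
    """Keep the last occurrence of each key, mimicking 'replace' semantics.

    key_indices picks the fields assumed to form the primary key (a, b, c
    in the example table); adjust them to match the real file layout.
    """
    seen = {}
    for row in rows:
        key = tuple(row[i] for i in key_indices)
        seen[key] = row  # later rows overwrite earlier duplicates
    return list(seen.values())

# Usage: two rows with the same (a, b, c) key collapse to the last one.
rows = [
    ["2.6.190.56", "2.6.190.63", "33996344", "33996351", "GB", "Error Kingdom"],
    ["2.6.190.56", "2.6.190.63", "33996344", "33996351", "GB", "United Kingdom"],
]
print(dedupe_rows(rows))
```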
$ psql pgloader < csv/csv.sql
$ ../pgloader.py -vc pgloader.conf csv
Using pgloader.conf configuration file
Will consider following sections:
csv
[csv] parse configuration
Notice: reject log in /tmp/csv.rej.log
Notice: rejected data in /tmp/csv.rej
[csv] data import
Notice: COPY csv data
Error: Please check PostgreSQL logs
HINT: double check your client_encoding, datestyle and copy_delimiter
settings
$sudo tail -f /var/log/pglog/postgresxx-xx-xx.log
ERROR: duplicate key violates unique constraint "csv_pkey"
CONTEXT: COPY csv, line 1: "33996344,33996351,Error Kingdom,GB"
STATEMENT: COPY csv (a, b, d, c) FROM stdin USING DELIMITERS ','
So.. doesn't really solve my issue.
Dang it..
---------------------------(end of broadcast)---------------------------
TIP 5: don't forget to increase your free space map settings