Re: Bulk Load Ignore/Skip Feature

On Wed, 2007-11-14 at 00:02 -0800, Willem Buitendyk wrote:
> Perfect - that appears to be exactly what I was looking for.

> Reg Me Please wrote:
> > On Wednesday 14 November 2007 05:50:36, Willem Buitendyk wrote:
> >   
> >> Will PostgreSQL ever implement an ignore-on-error feature when bulk
> >> loading data?  Currently my understanding is that any record that
> >> violates a unique constraint will cause the "copy from" command to halt
> >> execution instead of skipping over the violation and logging it, as is
> >> done in Oracle and DB2.
> >
> > pgloader
> >
> > http://pgfoundry.org/projects/pgloader/
> >
> >   

I believe the last time I tried this, there were still some issues with
it. See the attached email (if it makes it to the list).
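
For the archives: the usual workaround for the halt-on-violation behaviour
described above is to COPY into a staging table that has no constraints,
then merge across with a NOT EXISTS filter. A minimal sketch only; the
staging table csv_stage, the file path, and the column handling are
illustrative, not from this thread:

-- the staging table has no primary key, so COPY cannot fail on duplicates
CREATE TEMP TABLE csv_stage (LIKE csv);
COPY csv_stage (a, b, d, c) FROM '/tmp/csv.data' USING DELIMITERS ',';

-- move across only rows whose key is not already present;
-- DISTINCT ON also guards against duplicates within the batch itself
INSERT INTO csv (a, b, c, d)
SELECT DISTINCT ON (s.a, s.b, s.c) s.a, s.b, s.c, s.d
  FROM csv_stage s
 WHERE NOT EXISTS (SELECT 1 FROM csv t
                    WHERE t.a = s.a AND t.b = s.b AND t.c = s.c);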

--- Begin Message ---
On Mon, 2007-08-27 at 11:27 +0200, Dimitri Fontaine wrote:

> We've just made some tests here with 2.2.1 and as this release contains the 
> missing files, it works fine without any installation.

Yep, I can confirm that it works. I am using the csv example.

Goal: functionality similar to MySQL's mysqlimport --replace
(overwrite any rows that have duplicate primary keys)
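
Lacking that switch, one way to get --replace semantics in PostgreSQL is
to stage the batch and then delete-and-insert inside a single transaction.
A sketch only; csv_stage and the file path are illustrative:

BEGIN;
CREATE TEMP TABLE csv_stage (LIKE csv) ON COMMIT DROP;
COPY csv_stage (a, b, d, c) FROM '/tmp/csv.data' USING DELIMITERS ',';
-- "replace": first drop existing rows that share a key with the batch...
DELETE FROM csv USING csv_stage s
 WHERE csv.a = s.a AND csv.b = s.b AND csv.c = s.c;
-- ...then load the batch wholesale
INSERT INTO csv SELECT * FROM csv_stage;
COMMIT;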

$ psql pgloader < csv/csv.sql
$ ../pgloader.py -Tvc examples/pgloader.conf csv

pgloader=# alter table csv add primary key (a,b,c);
pgloader=# \d csv
        Table "public.csv"
 Column |     Type     | Modifiers
--------+--------------+-----------
 a      | bigint       | not null
 b      | bigint       | not null
 c      | character(2) | not null
 d      | text         |
Indexes:
    "csv_pkey" PRIMARY KEY, btree (a, b, c)

pgloader=# select * from csv;
    a     |    b     | c  |       d
----------+----------+----+----------------
 33996344 | 33996351 | GB | United Kingdom
 50331648 | 68257567 | US | United States
 68257568 | 68257599 | CA | Canada
 68257600 | 68259583 | US | United States
 68259584 | 68259599 | CA | Canada

$ cat csv/csv.data
"2.6.190.56","2.6.190.63","33996344","33996351","GB","Error Kingdom"
"4.17.143.0","4.17.143.15","68259584","68259599","CA","new Country"
(Note: only columns 3 to 6 are taken for loading.)

$ psql pgloader < csv/csv.sql
$ ../pgloader.py -vc pgloader.conf csv
Using pgloader.conf configuration file
Will consider following sections:
  csv

[csv] parse configuration
Notice: reject log in /tmp/csv.rej.log
Notice: rejected data in /tmp/csv.rej
[csv] data import
Notice: COPY csv data

Error: Please check PostgreSQL logs
HINT:  double check your client_encoding, datestyle and copy_delimiter
settings

$ sudo tail -f /var/log/pglog/postgresxx-xx-xx.log
ERROR:  duplicate key violates unique constraint "csv_pkey"
CONTEXT:  COPY csv, line 1: "33996344,33996351,Error Kingdom,GB"
STATEMENT:  COPY csv (a, b, d, c)  FROM stdin USING DELIMITERS ','

So it doesn't really solve my issue.
Dang it..
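
For what it's worth, COPY is all-or-nothing: the first duplicate key aborts
the entire statement, which is what the server log above shows. When bad
rows really must be skipped one at a time, a PL/pgSQL loop that traps
unique_violation does work, at the cost of one subtransaction per row.
A sketch, assuming the data was first loaded into a constraint-free staging
table; the names csv_stage and load_skip_dups are illustrative:

CREATE OR REPLACE FUNCTION load_skip_dups() RETURNS integer AS $$
DECLARE
    r       csv_stage%ROWTYPE;
    skipped integer := 0;
BEGIN
    FOR r IN SELECT * FROM csv_stage LOOP
        BEGIN
            INSERT INTO csv VALUES (r.a, r.b, r.c, r.d);
        EXCEPTION WHEN unique_violation THEN
            skipped := skipped + 1;  -- duplicate key: count it and move on
        END;
    END LOOP;
    RETURN skipped;  -- how many rows were silently dropped
END;
$$ LANGUAGE plpgsql;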



--- End Message ---
