Re: Bulk Load Ignore/Skip Feature

Hi all,

On Friday 16 November 2007 18:04:44, Willem Buitendyk wrote:
> Martijn van Oosterhout wrote:
> > On Thu, Nov 15, 2007 at 08:09:46PM -0800, Willem Buitendyk wrote:
> >> Damn - so the unique constraint is still an issue.  What gives?  Why is
> >> it so hard to implement this in Postgresql?  sigh - if only I had more
> >> time.
> >
> > Can you explain? The server of course still generates error messages in
> > the logs, there's no way around that. However it looks to me that the
> > data ended up in the database correctly? Or did I miss something?

pgloader loads the non-conflicting data and produces two files for the rest:
a reject log with the errors for the rows that could not be inserted (COPYed),
and a reject data file containing the rejected input lines, ready to be
processed again if the operator chooses to.
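
To make the reject log / reject data file idea concrete, here is a minimal
sketch. It is not pgloader's code: load_with_rejects is a hypothetical name,
and it uses Python's standard sqlite3 module as a stand-in for PostgreSQL so
that it runs without a server. One assumption matters in particular: SQLite
lets a transaction continue after a failed INSERT, whereas PostgreSQL aborts
the transaction, so a real loader would need savepoints or smaller
transactions around the rows it retries.

```python
import io
import sqlite3


def load_with_rejects(conn, lines, reject_data, reject_log):
    """Insert one integer per input line into table "unic".

    Lines that fail (bad integer or unique violation) are copied verbatim
    to reject_data, and the error is recorded in reject_log.
    Returns a (loaded, rejected) pair of counts.
    """
    loaded = rejected = 0
    for lineno, line in enumerate(lines, start=1):
        try:
            value = int(line.strip())
            conn.execute("INSERT INTO unic (a) VALUES (?)", (value,))
            loaded += 1
        except (ValueError, sqlite3.IntegrityError) as exc:
            # Keep the raw input line so it can be fixed and loaded again.
            reject_data.write(line if line.endswith("\n") else line + "\n")
            reject_log.write(f"line {lineno}: {exc}\n")
            rejected += 1
    conn.commit()
    return loaded, rejected


# Stand-in database with the same shape as the "unic" table in this thread.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE unic (a INTEGER UNIQUE)")

reject_data, reject_log = io.StringIO(), io.StringIO()
counts = load_with_rejects(conn, ["1\n", "2\n", "1\n", "3\n"],
                           reject_data, reject_log)
```

Here the third input line duplicates the first, so it ends up in reject_data
while the other three rows are loaded.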

At the moment, though, it provides no way to automatically UPDATE the rows
that conflict on the primary key. I'm really hesitant to code this option:
what should it do when the conflict is on a unique constraint that is not the
primary key? Compare:

dim=# create table unic(a integer unique);
dim=# insert into unic values (1);
INSERT 2312559 1
dim=# insert into unic values (1);
ERROR:  duplicate key violates unique constraint "unic_a_key"

dim=# create table pk(a integer primary key);
dim=# insert into pk values (1);
INSERT 2312565 1
dim=# insert into pk values (1);
ERROR:  duplicate key violates unique constraint "pk_pkey"

I'm thinking that in the first case you may not want existing values to be
overwritten, while in the second case overwriting is what you want. Should it
be the user's responsibility to tell the two apart, by configuring pgloader
properly, or should the tool try hard to protect users from themselves?

And how should pgloader act on a table with a surrogate pk and a unique
constraint, when you want to automatically update the surrogate key but not
the unique data, or the other way around?
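
One possible shape for such an option, purely as a sketch: the user opts in to
updating on primary key conflicts, and any other unique conflict is still
rejected. Everything here is an assumption for illustration: the table "item",
the function insert_or_resolve and its update_on_pk_conflict flag are
hypothetical, and sqlite3 again stands in for PostgreSQL. The conflict type is
detected by checking whether the primary key already exists rather than by
parsing the error message.

```python
import sqlite3


def insert_or_resolve(conn, row, update_on_pk_conflict=False):
    """Try to insert (id, code, label); return 'inserted', 'updated' or 'rejected'."""
    try:
        conn.execute("INSERT INTO item (id, code, label) VALUES (?, ?, ?)", row)
        return "inserted"
    except sqlite3.IntegrityError:
        pass  # Some unique constraint was violated; find out which below.

    item_id, code, label = row
    pk_taken = conn.execute(
        "SELECT 1 FROM item WHERE id = ?", (item_id,)
    ).fetchone()

    if pk_taken and update_on_pk_conflict:
        try:
            # Overwrite the existing row that has the same primary key.
            conn.execute(
                "UPDATE item SET code = ?, label = ? WHERE id = ?",
                (code, label, item_id),
            )
            return "updated"
        except sqlite3.IntegrityError:
            # The new code collides with another row's unique value.
            return "rejected"

    # Conflict on the non-pk unique constraint, or updates not enabled.
    return "rejected"


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE item (id INTEGER PRIMARY KEY, code TEXT UNIQUE, label TEXT)"
)
results = [
    insert_or_resolve(conn, (1, "A", "first")),
    insert_or_resolve(conn, (1, "A", "renamed"), update_on_pk_conflict=True),
    insert_or_resolve(conn, (2, "A", "other"), update_on_pk_conflict=True),
]
```

The third row has a fresh id but reuses code "A", so it is rejected even with
the flag set. This sketch does not address the opposite policy (updating the
surrogate key while keeping the unique data), which is exactly the kind of
choice that would have to be left to configuration.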

So I have two questions for the community:
 - should I provide a pgloader mailing list?
 - what do you think about adding the UPDATE-on-duplicate-key-error option?

> My apologies.  I misinterpreted that last post.  I have not been able to
> try pgloader as I am using the windows platform.

pgloader is a Python "script" which depends on psycopg for handling the
PostgreSQL connection, and otherwise only on standard Python modules. The
following link provides Windows binaries for psycopg:
  http://www.stickpeople.com/projects/python/win-psycopg/

I've had reports of pgloader running on Windows, even though I made no
specific effort for this to happen and I don't have any proprietary licensed
OS to test pgloader on.

Hope this helps,
-- 
dim
