Please reply-to the list, not just myself. > -----Original Message----- > From: Zbigniew [mailto:zbigniew2011@xxxxxxxxx] > Sent: Monday, December 10, 2012 6:26 AM > To: David Johnston > Subject: Re: Problem with aborting entire transactions on error > > 2012/12/10, David Johnston <polobo@xxxxxxxxx>: > > >> A simple - but very common - example: I wanted to perform really > >> large number of inserts - using transaction, to make it faster - > >> while being sure the duplicate entries will be skipped. > > > > And what happens if one of those inserts errors out for reasons other > > than a duplicate? > > Then maybe there could be something like > "DON'T_ROLLBACK_ON_DUPLICATE_ERROR" > instead of "more general" directive? > > > Since current belief is that such behavior is unwise > > No idea, why common belief is "allowing the user to decide is unwise". > Is the "common assumption", that the users of Postgres are children > 4-5 years old? If it requires coding something to provide the user the desired flexibility then whether or not such flexibility is wise or unwise is going to go into said decision. Also, since you are begging others to solve your own problems you are in many ways behaving as a child. Especially with an Open Source project like PostgreSQL adults are welcomed and encouraged to solve their own problems by altering the source code and, in the spirit of community, contributing it back to the project (contribution also means you do not have to maintain your own custom version of the software). Please do not take this as a personal affront; I just simply wanted to extend on the analogy that you put forward. I fully recognize the need to ask the community for help myself (I am also not a coder) but in doing so I know I am imposing on the charity of others and as such need to make best efforts to understand and respect their point-of-view and readily acknowledge my own ignorance in the matter. > > > no one is willing to give their time to doing so. > > Pay attention, that actually my (and not just mine) request isn't to write > some new, sophisticated code - but just a little change in existing code > instead. If this is your belief then do it yourself. Your response to my question above already indicates that what you initially thought was very simple had at least one complication you did not understand. There are likely others. > > > I'm not sure whether that means that if you supplied such a patch it > > would be rejected. Since their is a correct way to solve the > > duplicates scenario (see below) without savepoints you may wish to > > supply another example if you want to try and convince people. > > I'm afraid, I don't understand. How many examples can I supply for the > problem, like: > 1. I need only to avoid duplicated rows (the ones with duplicated primary > key). > 2. I don't need any "stronger" reaction for such error - since I know, there > won't be any troubles. > 3. Since server controls primary key, all I need is to make it a bit less > "nervous" about such attempt to insert dupe, only dropping this one (and > not everything else together with single faulty entry)? > > Whatever will I supply, the scenario basically will be the same - I just would to > avoid such "throwing the baby out with the bathwater". > > > The true solution is to insert into a staging table that allows > > duplicates (but ideally contains other constraints that you do care > > about) and then de-dupe and insert into your final table. > > ...which means twice as work for the server (AT LEAST twice). When all I > needed was just to skip dupes, not aborting the transaction. And it could be > perfectly done. > > > This may be an undesirable instance of the database not allowing you > > to shoot yourself in the foot but as ignoring errors is bad practice > > motivation to allow it is small. > > I'm NOT proposing to just ignore errors - read carefully. please - I'm writing > about letting user decide, what impact should an error have: > - only dropping "faulty" query, or... > - ...aborting the entire transaction (which is undesirable in many situations) Except for ETL aborting is almost universally the desired behavior. I'll grant you your arguments for why, during ETL, this feature could be useful (as opposed to performing duplicate work), but I still think that a blind "ignore duplicates and leave the first instance in place" ETL policy is a bad one in almost all cases. > > > You should always be able to import the data without errors into an > > explicitly defined table and then write queries to convert between the > > input format and the final format - explicitly making no coding the > > necessary translation decisions and procedures. In that way you > > always know that your import routine is always working as expected and > > not guessing whether it was the known error condition or an unknown > > condition the caused a record to go missing. > > Well, looking for solution yesterday I've found several messages of other > people, > who complained about the same: that simple duplicate record is > aborting whole transaction. This makes me think, that indeed this is quite > common case, when limiting the "server response" (and NOT "ignoring the > error" completely, which could mean insertion of the duplicated row), could > be very useful. There are situations, when all we need is just avoid dupes - > that's why we have primary keys - and "too much protection" makes the > work unnecessarily difficult. > -- > regards, > Zbigniew It is easy to complain but apparently no one feels strongly enough to either code a solution themselves or sponsor someone else to do so. As I have not seen any core coders respond I cannot be certain whether there are underlying technical issues preventing this but there is at the least a resource allocation concern since neither code donors nor those sponsored by clients have made the time to implement this "simple feature". It may be more productive, not being a core coder yourself, to simply ask why such a feature has not been implemented given the apparent demand instead of asserting (from ignorance) that such an implementation should be very simple to accomplish. The later approach (as well as your response to me - personally) is much more confrontational and contrary (the direct reply at least) to the posted etiquette for this community. David J. -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general