Caveat: I am not a PostgreSQL hacker, and have not looked into its internals at all, though I've read a number of excellent articles and blog posts on some of its features (TOAST, HOT updates, MVCC, etc). I'm a programmer who has made use of PG from a number of languages, and formed a strong opinion on the high quality and usability of Pike's bindings. On Tue, Dec 11, 2012 at 5:29 AM, Zbigniew <zbigniew2011@xxxxxxxxx> wrote: > 2012/12/10, Scott Marlowe <scott.marlowe@xxxxxxxxx>: > >> Databases aren't as simple as you imagine. What you're basically >> asking for from the pg engine is for it to enclose every insert into a >> subtransaction (i.e. set a savepoint) to check for an error. > > No, I'm not. > > It's able (I mean Postgres) to detect an error, and it's able to react > on error. "What I'm basically asking for" is an option to change its > reaction; that's all. Just to change a bit the thing already exisiting > - not to add new one. You're thinking that it's easy to just "back out" something that failed. Perhaps this is true in the simple case where one statement does one insert and that's all, but if this is to be a PostgreSQL feature, it needs to handle the more complicated cases. In each of these examples, what should be kept and what should be undone? (Let's suppose that your language has a query() function that executes an SQL query while handwaving questions of which database connection to use.) Row ID 3 already exists, in each case. query("insert into table (id) values (1),(2),(3)"); query("insert into table (id) select id from other_table where id in (1,2,3)"); query("insert into table (id) values (1); insert into table (id) values (2); insert into table (id) values (3)"); query("create trigger after update on other_table for each row insert into table values (new.id); update other_table set dummy=dummy+1 where id in (1,2,3)"); I'm pretty sure the example trigger would need to be rewritten as a procedure, but you get the idea. There's a LOT more complexity than just "if Postgres detects an error, it should (optionally) just not do that bit". >> The overhead cost of doing this in pgsql is not cheap. Now what I would >> use for this would be some tool written to interface with pgsql and do >> the job for you, like pgloader. It gives you the simple interface to >> do what you're asking. > > You see? The people had to create sophisticated loaders, just to work > around the problem. Any good project will work in layers. The core PostgreSQL engine doesn't have support for the special backslash commands that function at the command-line interface; they're implemented in the psql client. Nor does PG need any code to handle the peculiarities of network routing, because that's all managed by the TCP/IP sockets layer. Often the best thing to do is to bolt something onto the outside, because that adds zero complexity for everyone who isn't using this feature. > All this can be spared to users just by adding an OPTION. This option > can be "off" by default. There can be even a description in manual: > "We discourage...", "...use it only, when you know, what are you > doing...", "...you have been warned". But I bet, many users would > appreciate this. Besides: as you noticed (and the others too), several > other servers allows this. Then maybe really it's not that evil thing, > as you think? Are you a programmer? Are you aware how much complexity each option adds? Every single combination must be tested and debugged. In this instance, that means testing every part of Postgres before and after several types of failure, to make sure everything works correctly in both cases. That is not cheap. And then there's the user-facing complexity (documenting the option, explaining when it's useful, etc), and now everyone has to decide whether or not to use it. Also not cheap. > Yes, and exactly because of this I'm pretty sure, that you understand, > why I don't want to trash such long transaction. Why I would to commit > it - just skipping the dupes. The normal way to use transactions is to guarantee atomicity (the first letter from ACID). The classic example is a money transfer - you deduct $100 from one account and add $100 to another. Naturally you want this to be all-or-nothing - it's better to leave the money where it is than to leave the account-holder (or the bank, if you do them in the other order) short a hundred bucks. What you're doing is fiddling with transactions as a means of improving performance, and then discovering that they aren't exactly what you want. Have you considered looking into some other methods of improving performance? You may find that there's a completely different way to achieve your goal. > Maybe someone of the devs team could answer this doubt: is the > proposed change simple to implement? I'm not a PG dev, but I've fought the battle against complexity in enough other situations that I know that it's much more usual to underestimate than overestimate the cost. ChrisA -- Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general