Re: Problem with aborting entire transactions on error

Zbigniew <zbigniew2011@xxxxxxxxx> · Mon, 10 Dec 2012 18:29:49 +0000

2012/12/10, Scott Marlowe <scott.marlowe@xxxxxxxxx>:

> Databases aren't as simple as you imagine.  What you're basically
> asking for from the pg engine is for it to enclose every insert into a
> subtransaction (i.e. set a savepoint) to check for an error.

No, I'm not.

It's able (I mean Postgres) to detect an error, and it's able to react
on error. "What I'm basically asking for" is an option to change its
reaction; that's all. Just to change a bit the thing already exisiting
- not to add new one.

> The overhead cost of doing this in pgsql is not cheap.  Now what I would
> use for this would be some tool written to interface with pgsql and do
> the job for you, like pgloader.  It gives you the simple interface to
> do what you're asking.

You see? The people had to create sophisticated loaders, just to work
around the problem.

> The idea that everything like this belongs in the database,

But why "everything"? My question was about one specific thing, and I
quite clearly (I think so...) explained, why. Not to "ignore errors",
not to "add savepoints" - just to proceed with this, which it's
already doing (skipping faulty query), but NOT aborting whole
transaction - as an optional behaviour.

> particularly coded by the core developers isn't how things generally
> get done in pgsql.  We've got a small sharp team of developers working
> on BIG things, like covering indexes, HOT updates, query planner
> optimizations and so on.  Fixing this problem is core probably doesn't
> tickle anyone's itch, and there are so many ways to work around it in
> pg that would not work as well in other dbs, like loading to a temp
> table and updating in one big query.  That method might run an oracle
> db out of rollback space etc, but on pg it's the way it's done.

Just try to make a Google search on "postgresql abort transaction" -
you'll see, how much trouble this solution makes to the users. Pay
attention, that I didn't even mind "...on duplicate". How much time
the people are spending 1. To find out, what actually is going on
("such thing was working on X and Y"), 2. To look for solution
(non-existing), and then... 3. ...to work around the problem with
"savepoints" or other way.

All this can be spared to users just by adding an OPTION. This option
can be "off" by default. There can be even a description in manual:
"We discourage...", "...use it only, when you know, what are you
doing...", "...you have been warned". But I bet, many users would
appreciate this. Besides: as you noticed (and the others too), several
other servers allows this. Then maybe really it's not that evil thing,
as you think?

> The thing that often happens is that people learn to do things a
> certain way on another db and then think that that experience should
> translate to postgresql straight across.  It often does not, because
> postgresql is so different in many ways.
>
> Keep in mind with pg transactions you can put almost anything into a
> transaction, and the only real limit to the size of a transaction you
> can run in it is the size of your hard drives.  This alone lets you
> approach problems from a very different perspective than on most other
> dbs.

Yes, and exactly because of this I'm pretty sure, that you understand,
why I don't want to trash such long transaction. Why I would to commit
it - just skipping the dupes.

> Lastly it's a free (as in beer, as in freedom) database.  People work
> on it as much out of love as for money, and if you want to get
> traction for a someone else to volunteer their time to make a change,
> YOU have to make the case. I'm not a pg hacker.  But I've made a case
> several times and gotten someone to make a change a few times for me.
> If you think you have a case here you'll have to convince a lot of
> people who don't think you do.

Yes, I agree: it costs implementation time. But pay attention: few
afternoons (maybe I'm wrong, but I doubt, this could take more to
experienced PG-developers) of your team - and on the other hand, the
time wasted by thousands of people, who are looking how to do the
things that were so simple while using some other server (even Oracle,
as has been mentioned). And in the future THERE STILL WILL BE THE
SAME. And it can be spared to all this people just by adding an
option.

No idea, is it simple or very complicated. But if it is simple - why not?

Maybe someone of the devs team could answer this doubt: is the
proposed change simple to implement?
-- 
regards,
Zbigniew

-- 
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general