Re: MOVE

PFC <lists@xxxxxxxxxxxxxxxxxxxxx> · Sat, 15 Jan 2005 01:07:47 +0100

BEGIN;
INSERT INTO archive (...) SELECT ... FROM active WHERE user_id = ...;
DELETE FROM active WHERE user_id = ...;
COMMIT;

The DELETE can only delete the rows returned by the select, that's the
whole point of transactions...

	Well, in the case of having a unique index on user_id, and if no-one  
updates the row between the insert and the delete, it will work ;)

	And if someone updates it in between, well, the update is not archived,  
so you have to LOCK your row FOR UPDATE.

	And if this procedure is called twice at the same time, the rows will be  
historized twice...

	etc...

	Which is precisely why I don't like this approach !

	As a side note, I've installed 8.0 rc, and wow. The slow queries feel a  
lot faster on the command prompt, my small queries have became faster  
too... very good work !

	In the end, I've implemented it with an AFTER DELETE trigger on the  
'live' table, which, after the row has been deleted, inserts it in the  
history table, using the magic variable OLD. This will work because the  
row is already deleted, thus can't be concurrently updated by another  
transaction (because a transaction trying to update a row will wait on the  
lock acquired by the DELETE, and vice versa).

	So, for ONE row at a time, in a trigger, it works beautifully, thank you  
postgres ! I find the solution very elegant : when a session expires, it  
is deleted from the live session table, then the trigger catches it just  
in time to shove it in the sessions history table, then some other tables  
like user-to-chatroom connections, which happen to have a user_id  
referencing into the live sessions table, get the ON DELETE CASCADE and  
are also purged and historized automatically. I am very happy with this  
solution BUT it's done one-row-at-a-time, so it's slower than I'd like !

	The key is to insert the row deleted from the live table into the history  
table AFTER it has been deleted, to avoid all funky locks problems. Now  
consider the following : you must DELETE several items at the same time  
and historize them.

	If you INSERT then DELETE:

		- records can be inserted, updated or deleted between the two. The  
inserted ones will be historized but not deleted (duplicates !), the  
deleted ones will be lost forever, unhistorized, the updated ones won't  
have their updates historized. Not very well for concurrecy !

	You can LOCK FOR UPDATE before, this solves the UPDATE and DELETE  
problem, but not the INSERT problem.

	You can, of course, lock the entire table, but well, it reminds me too  
much of the MySQL way.

	You can also use SERIALIZABLE mode which solves all the problems, but if  
something goes wrong, everything fails and you have to retry the whole  
trasaction, whereas a proper lock would be waited on...

	If there is a primary key in the 'live' table you can SELECT FOR UPDATE  
into a tamporary table, then delete using the pkeys in the temp table,  
then insert from the temp table... ugly !

	That's why I bother you to have the possibility of DELETE returning the  
DELETE'd rows ;)

	It's not very useful if you process one row, but when you process several  
at a time, it would be really great, because instead of 2*N queries  
(DELETE+INSERT hidden in a trigger) you'd just do one (INSERT ... DELETE  
AND SELECT ... FROM ...). Today, if you don't want to do it in a trigger,  
you have to have a unique index, SELECT FOR UPDATE, INSERT, DELETE, that's  
three queries per row.

	In a perfect world, this would be then used in an ON DELETE RULE which  
would replace the DELETES by deletes inserting the rows into the history  
table

	Also I've thought about some other interesting applications, if DELETE  
returns rows, why not UPDATE or even INSERT ?

	Many applications use INSERT... then SELECT currval(sequence). I also  
like to set defaults in the database, like for instance some rows which  
have timestamp fields defaulting to now() or things like that. I have a  
tree table with a ltree field which is generated by a trigger from the  
parent's path and the current row's id. Some other fields are also  
inherited from the parent. Why not do INSERT INTO ... AND SELECT ... which  
would return the sequence field, and any other fields which have been  
initialized by ON INSERT triggers... this would be neat... instead of  
INSERT, SELECT currval, SELECT .. FROM table WHERE id=...

	Same thing for on update triggers.

	You could replace some plpgsql procedures with one query, and what's more  
important, not worry about locking headaches.

	Anyway, my problem is solved now with triggers, but I like the idea very  
much (and Oracle has it) (and Tom once said a DELETE was just more or less  
like a SELECT)... so ...

	Regards

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings