Hi,

I face the following problem: I have a large table with 12 million addresses, referenced by 20 other tables (some containing about one million entries). There are indexes on the foreign keys.

I now want to delete about 10 million addresses that are no longer referenced from anywhere, using a statement like:

    DELETE FROM address
    WHERE id NOT IN (SELECT address_id FROM bank
                     WHERE address_id IS NOT NULL)
      AND id NOT IN (SELECT poboxaddress_id FROM bank
                     WHERE poboxaddress_id IS NOT NULL)
      AND id NOT IN (SELECT address_id FROM bankconnection
                     WHERE address_id IS NOT NULL)
      ...lots more...

This takes more than 10 hours here (I had to cancel the statement). I have two suggestions:

1. Currently, for each row to be deleted, a SELECT is run against each column that references the deleted entry. This takes a lot of time. It is possible to check in an elegant way whether an entry can be deleted, as in the query above. I know it is not easy to generate such a statement automatically, but it would make deletions much faster.

2. I would have loved a special option "UNREFERENCED" for the DELETE statement, so that all rows referenced from anywhere would automatically be excluded from my delete. When this keyword is given, no FK checks have to be done, because referential integrity cannot be violated anyway:

    DELETE UNREFERENCED FROM address WHERE …;

Thanks for your time and this great database product.

Regards,
Daniel Migowski
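
P.S. To make the idea of "delete only what nothing references" concrete, here is a minimal sketch of the same filter written with NOT EXISTS instead of NOT IN. It is only an illustration under stated assumptions: it runs against an in-memory SQLite database as a stand-in for the real server, covers just the three references shown in the query above, and the `id` columns on `bank` and `bankconnection` as well as the helper names `build_demo` and `delete_unreferenced` are invented for the example. With NOT EXISTS the `IS NOT NULL` filters are not needed, because a NULL reference never compares equal to an address id. Whether this form is faster on the real data is not something measured here.

```python
import sqlite3


def build_demo():
    """Create a tiny in-memory schema (hypothetical, modelled on the post)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE address (id INTEGER PRIMARY KEY);
        CREATE TABLE bank (
            id INTEGER PRIMARY KEY,              -- assumed column
            address_id INTEGER REFERENCES address(id),
            poboxaddress_id INTEGER REFERENCES address(id)
        );
        CREATE TABLE bankconnection (
            id INTEGER PRIMARY KEY,              -- assumed column
            address_id INTEGER REFERENCES address(id)
        );
        -- Addresses 1-3 are referenced, 4 and 5 are not.
        INSERT INTO address (id) VALUES (1), (2), (3), (4), (5);
        INSERT INTO bank (address_id, poboxaddress_id)
            VALUES (1, NULL), (NULL, 2);
        INSERT INTO bankconnection (address_id) VALUES (3), (NULL);
        """
    )
    return conn


def delete_unreferenced(conn):
    """Delete addresses that none of the three referencing columns point to.

    Returns the number of deleted rows.
    """
    cur = conn.execute(
        """
        DELETE FROM address
        WHERE NOT EXISTS (SELECT 1 FROM bank b
                          WHERE b.address_id = address.id)
          AND NOT EXISTS (SELECT 1 FROM bank b
                          WHERE b.poboxaddress_id = address.id)
          AND NOT EXISTS (SELECT 1 FROM bankconnection c
                          WHERE c.address_id = address.id)
        """
    )
    return cur.rowcount


if __name__ == "__main__":
    conn = build_demo()
    print("deleted:", delete_unreferenced(conn))
    print("remaining:", [row[0] for row in
                         conn.execute("SELECT id FROM address ORDER BY id")])
```

On the real schema the same pattern would need one NOT EXISTS clause per referencing column, which is the part I would like the database to derive from the foreign key definitions.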