"Roman F" <romanf@xxxxxxxxxxxx> writes: > ... Executing something like > the following statement would work, but even with indexes it takes an > insane amount of time to execute for each of the tables: > DELETE FROM child_table WHERE parentid NOT IN > (SELECT parentid FROM parent_table) Uh, what sort of query plan are you getting for that? PG 7.4 and up can do a reasonable job with NOT IN if the sub-select is small enough to fit into an in-memory hash table (of size sort_mem). I'm betting that your sort_mem setting is not high enough to encourage the planner to try the hash method. You could try increasing sort_mem ... but given the size of your tables, you might end up with a hash table large enough to drive the system into swapping, in which case it'll still be mighty slow. Another idea is to try an outer join: SELECT child_table.parentid INTO tmp_table FROM child_table LEFT JOIN parent_table ON (child_table.parentid = parent_table.parentid) WHERE parent_table.parentid IS NULL; which essentially does a join and then pulls out just the child_table rows that failed to match. This will probably end up getting done via a merge join or hybrid hash join, either of which are more scalable than the NOT IN code. You still have to do the actual deletions in child_table, but as long as there aren't too many, a NOT IN using tmp_table should work OK. regards, tom lane ---------------------------(end of broadcast)--------------------------- TIP 5: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq