Em 13/06/2016 23:36, Edson Richter escreveu:
Em 13/06/2016 23:18, rob stone escreveu:
On Mon, 2016-06-13 at 22:41 -0300, Edson Richter wrote:Em 13/06/2016 22:33, Edson Richter escreveu:I've a table "A" with 4,000,000 records. I've decided to delete records from oldest to newest but I can't delete records that have references in tables "B", "C" or "D". so, I've with qry as ( select A.id from A where not exists (select 1 from B where B.a_id = A.id) and not exists (select 1 from C where C.a_id = A.id) and not exists (select 1 from D where D.a_id = A.id) and A.creation_date < (now()::date - interval '12 month') order by A.id DESC limit 2000 ) delete from A where id in (select id from qry); All three referenced tables have indexes (B.a_id; C.a_id; D.a_id) in order to make query faster. So for first 2 million rows it worked really well, taking about 1 minute to delete each group of 2000 records. Then, after a while I just started to get errors like: ERROR: update or delete in "A" violates foreign key "fk_C_A" in "C". DETAIL: Key (id)=(3240124) is still referenced by table "C". Seems to me that indexes got lost in the path - the query is really specific and no "C" referenced records can be in my deletion. Has anyone faced a behavior like this? Am I doing something wrong?Of course: Version string PostgreSQL 9.4.8 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-4), 64-bit Oracle Linux 7 x64 with all updates. Running on EXT4 file system. Computer is Dell R420 with mirrored disks, 80GB of RAM (database has < 40GB in total). Sorry for not putting the info in the first e-mail. EdsonWhat does:- SELECT COUNT(*) FROM C WHERE C.a_id = 3240124; return? Is it a many-to-one or a one-to-one relationship?
SELECT COUNT(*) FROM C WHERE C.a_id = 3240124;count
-------
1
(1 registro)
A.id is primary key of A table. Each table has its own primary key.
Relationship to others table is 1-N, being N = {0,1}
A.id -> B.a_id (being B.a_id unique but not enforced by unique key)
A.id -> C.a_id (being C.a_id unique but not enforced by unique key)
A.id -> D.a_id (being D.a_id unique but not enforced by unique key)
Regards,
Edson
Just in case, I've run:
- vacuum full analyze verbose;
- reindex index ix_c_a_id;
Result I get same error. So, I'm inclined to discard that this is a index error.
Interesting:
with qry as (select A.id
from A
where creatingdate < (now()::date - interval '12 month')
and not exists (select 1 from B where B.a_id = A.id)
and not exists (select 1 from C where C.a_id = A.id)
and not exists (select 1 from D where D.a_id = A.id)
order by A.id limit 2000)
select * from qry where id = 3240124;
Why do you assume 3240124 is within the first 2000 qualified records that the CTE is limited to checking?
Total query runtime: 2.2 secs
0 rows retrieved.
Why delete causes error, but querying don't?
Given the error message this answer seems self-evident...
Would it be a bug when using delete ... where id in (subquery)?
I'm unsure regarding the solution but I suspect the problem is that between selecting the A row and deleting it another concurrent process added a record to C that, if you were to re-run the select would cause the row from A to be skipped. But the single query doesn't have that option so it ends up failing.
There is a FOR UPDATE clause you can add to the select but I don't think that works here since table C is the one being altered and at the time of the query there is nothing to lock.
I'm doubting this is a bug, just poor concurrency understanding. Sorry I cannot enlighten further at the moment.
David J.