Hi,
I have a big table (with 1.6 milion records). One of the columns is called end_date and it`s type is timestamp. I'm trying to find the best way to delete most of the table but not all of it according to a range of dates. The table structure :
afa=# \d my_table;
Table "public.my_table"
Column | Type | Modifiers
---------------------------------+--------------------------+----------------------------------------------------------
id | bigint | not null default nextval('my_table_id_seq'::regclass)
devid| integer | not null
column_name| integer | not null
column_name| integer | not null
column_name| integer | not null
column_name| timestamp with time zone |
column_name| integer | not null
column_name| integer | not null
column_name| integer | not null
column_name| integer | not null
column_name| integer | not null
column_name| integer | not null
column_name| integer | not null
column_name| text | not null
column_name| integer | not null default 0
column_name| integer | not null default 0
column_name| integer | not null default 0
column_name| integer | not null default 0
column_name| integer | not null default 0
column_name| integer | not null default 0
column_name| integer | not null default 0
column_name| integer | default 0
column_name| integer | default 0
column_name| integer | default 0
column_name| integer | default 0
column_name| integer | default 0
column_name| integer | default 0
end_date | timestamp with time zone |
Indexes:
"my_table_pkey" PRIMARY KEY, btree (id)
"my_table_date_idx" btree (date)
"my_table_device_idx" btree (devid)
"end_date_idx" btree (end_date)
Foreign-key constraints:
"fk_aaaaa" FOREIGN KEY (devid) REFERENCES device_data(id)
Referenced by:
TABLE "table1" CONSTRAINT "application_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
TABLE "table2" CONSTRAINT "configuration_changes_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
TABLE "table3" CONSTRAINT "fk_57hmvnx423bw9h203260r8gic" FOREIGN KEY (my_table) REFERENCES my_table(id)
TABLE "table3" CONSTRAINT "interface_change_my_table_fk" FOREIGN KEY (my_table) REFERENCES my_table(id)
TABLE "table4" CONSTRAINT "my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id) ON DELETE CASCADE
TABLE "table5" CONSTRAINT "my_table_report_my_table_fk" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
TABLE "table6" CONSTRAINT "my_table_to_policy_change_my_table_foreign_key" FOREIGN KEY (my_table) REFERENCES my_table(id)
TABLE "table7" CONSTRAINT "network_object_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
TABLE "table8" CONSTRAINT "orig_nat_rule_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
TABLE "table9" CONSTRAINT "risk_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
TABLE "table10" CONSTRAINT "rule_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
TABLE "table11" CONSTRAINT "service_change_my_table_id_fkey" FOREIGN KEY (my_table_id) REFERENCES my_table(id)
As you can see alot of other tables uses the id col as a foreign key which make the delete much slower.
Solution I tried for the query :
delete from my_table where end_date <= to_date('12/12/2018','DD/MM/YYYY') and end_date > to_date('11/12/2018','DD/MM/YYYY');
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Delete on my_table (cost=0.00..65183.30 rows=1573862 width=6) (actual time=5121.344..5121.344 rows=0 loops=1)
-> Seq Scan on my_table (cost=0.00..65183.30 rows=1573862 width=6) (actual time=0.012..2244.393 rows=1572864 loops=1)
Filter: ((end_date <= to_date('12/12/2018'::text, 'DD/MM/YYYY'::text)) AND (end_date > to_date('11/12/2018'::text, 'DD/MM/YYYY'::text)))
Rows Removed by Filter: 40253
Planning time: 0.210 ms
Trigger for constraint table1: time=14730.816 calls=1572864
Trigger for constraint table2: time=30718.084 calls=1572864
Trigger for constraint table3: time=28170.363 calls=1572864
Trigger for constraint table4: time=29573.681 calls=1572864
Trigger for constraint table5: time=29629.263 calls=1572864
Trigger for constraint table6: time=29628.489 calls=1572864
Trigger for constraint table7: time=29798.121 calls=1572864
Trigger for constraint table8: time=29645.705 calls=1572864
Trigger for constraint table9: time=29657.177 calls=1572864
Trigger for constraint table10: time=29487.054 calls=1572864
Trigger for constraint table11: time=30010.978 calls=1572864
Trigger for constraint table12: time=26383.924 calls=1572864
Execution time: 350603.047 ms
(18 rows)
-----------------------
DELETE FROM my_table WHERE id IN (select id from my_table where end_date <= to_date('12/12/2018','DD/MM/YYYY') and end_date > to_date('11/12/2018','DD/MM/YYYY'));
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------
Delete on my_table (cost=92522.54..186785.27 rows=1572738 width=12) (actual time=9367.477..9367.477 rows=0 loops=1)
-> Hash Join (cost=92522.54..186785.27 rows=1572738 width=12) (actual time=2871.906..5503.732 rows=1572864 loops=1)
Hash Cond: (my_table.id = my_table_1.id)
-> Seq Scan on my_table (cost=0.00..49052.16 rows=1613116 width=14) (actual time=0.004..669.184 rows=1613117 loops=1)
-> Hash (cost=65183.32..65183.32 rows=1572738 width=14) (actual time=2871.301..2871.301 rows=1572864 loops=1)
Buckets: 131072 Batches: 32 Memory Usage: 3332kB
-> Seq Scan on my_table my_table_1 (cost=0.00..65183.32 rows=1572738 width=14) (actual time=0.009..2115.826 rows=1572864 loops=1)
Filter: ((end_date <= to_date('12/12/2018'::text, 'DD/MM/YYYY'::text)) AND (end_date > to_date('11/12/2018'::text, 'DD/MM/YYYY'::text)))
Rows Removed by Filter: 40253
Planning time: 0.419 ms
Trigger for constraint my_table_id_fkey: time=14291.206 calls=1572864
Trigger for constraint table2_fk: time=29171.591 calls=1572864
Trigger for constraint table3_fk: time=26356.711 calls=1572864
Trigger for constraint table4_fk: time=27579.694 calls=1572864
Trigger for constraint table5_fk: time=27537.491 calls=1572864
Trigger for constraint table6_fk: time=27574.169 calls=1572864
Trigger for constraint table7_fk: time=27716.636 calls=1572864
Trigger for constraint table8_fk: time=27780.192 calls=1572864
....
....
Execution time: 333166.233 ms ~ 5.5 minutes
(23 rows)
Loading into a temp table the data isnt option because I cant truncate the table because of all the dependencies...
Any idea what else can I check ?