Re: Need an idea to operate massive delete operation on big size table.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks everyone for providing the ideas and special thanks to Rui DeSousa.

On Thu, 16 Jan 2025 at 05:39, Rui DeSousa <rui.desousa@xxxxxxxxxx> wrote:


On Jan 15, 2025, at 4:24 PM, Alex Balashov <abalashov@xxxxxxxxxxxxxxx> wrote:

n my experience, mass deletions are tough. There may be a supporting index to assist the broadest criteria, but the filtered rows that result must still be sequentially scanned for non-indexed sub-criteria[1]. That can still be an awful lot of rows and a huge, time-consuming workload. 



This is how easliy do it. I’ve used this method many times over.  Create a purge queue table and a function to purge the records in batches.  

1. I normally try to keep the purge transaction to 500ms or less.  i.e. purge 1000 or 5000 records, etc.  if depends on the schema and cascading deletes, etc.
2. The function is setup to run in parallel; so kick off as many purge sessions as your system can handle.

Write a perl or shell script to just keep executing “select purge_date();” repeatedly until zero is returned.  Make sure to only call the function once in a given transaction.

You shouldn’t have to create any additional indexes or disable them while purging.  The application can continue to operate without issues and when the process is complete; vacuum and rebuild indexes on the effected tables.



create table purge_data_queue (
data_id bigint not null primary key
);

/* Identify all records to be purged */
insert into purge_data_queue (data_id)
select data_id
from data
where <records to delete>
;

create or replace function purge_data (_limit int default 1000)
returns int
as
$$
declare
_rowcnt int;
begin
create temp table if not exists purge_set (
data_id bigint not null
, primary key (data_id)
) on commit drop
;

/* Identify batch to be purged */
insert into purge_set (
data_id
)
select data_id
from purge_data_queue
limit _limit
for update skip locked
;

/* Delete batch from base table */
delete from data using purge_set
where data.data_id = purge_set.data_id
;

get diagnostics _rowcnt = ROW_COUNT;

/* Delete batch from queue table */
delete from purge_data_queue using purge_set
where purge_data_queue.data_id = purge_set.data_id
;
return _rowcnt;
end;
$$ language plpgsql
set search_path = schema_name
;


--
Thanks & Regards
Gambhir Singh


[Index of Archives]     [Postgresql Home]     [Postgresql General]     [Postgresql Performance]     [Postgresql PHP]     [Postgresql Jobs]     [PHP Users]     [PHP Databases]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Yosemite Forum]

  Powered by Linux