On Sun, 2023-07-30 at 23:00 -0600, peter plachta wrote:
> Background is we're trying a pg_repack-like functionality to compact a 500Gb/145Gb
> index (x2) table from which we deleted 80% of rows. Offline is not an option. The table
> has a moderate (let's say 100 QPS) I/D workload running.
>
> The typical procedure for this type of thing is basically CDC:
>
> 1. create 'log' table / create trigger
> 2. under SERIALIZABLE: select * from current_table insert into new_table
>
> What we're finding is that for the first 30 mins the rate is 10Gb/s, then it drops to
> 1Mb/s and stays there... and 22 hours later the copy is still going, and now the log
> table is huge, so we know the replay will also take a very long time.
>
> ===
>
> Q: what are some ways in which we could optimize the copy?
>
> Btw, this is Postgres 9.6.
>
> (We tried an unlogged table (that did nothing), we tried creating indexes after
> (that helped), and we're experimenting with RRI.)

Why are you doing this the hard way, when pg_squeeze or pg_repack could do it?

You definitely should not be using PostgreSQL 9.6 at this time.

Yours,
Laurenz Albe
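
P.S.: In case it helps, here is a rough sketch of the pg_repack route (the table
and database names below are placeholders, not taken from your mail). pg_repack
uses a log table and triggers internally, much like your hand-rolled CDC approach,
but handles the catch-up and swap for you:

    -- once, in the affected database (the pg_repack extension must be installed)
    CREATE EXTENSION pg_repack;

    # then, from the shell: rewrites the table and its indexes online;
    # --no-order does a plain rewrite instead of clustering by an index
    pg_repack --table=public.bloated_table --no-order -d mydb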