An autovacuum will not release much (if any) space back to the OS. That's perfectly normal. It just marks the space as available for new data to occupy.
So, if data is being added/removed at a similar rate, the table size will reach an equilibrium with lots of dead tuples that are being reused.
If autovacuum is happening too infrequently, then there'll be far too many dead tuples. If it's happening too frequently, then there'll be too much load on the I/O system.
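One way to see where a given partition sits on that spectrum is to check the statistics views. A query along these lines (the table name is hypothetical, substitute your own partition) shows the dead-tuple count and how often autovacuum has actually run:

```sql
-- Dead tuples and autovacuum activity for one partition
-- ('events_2022_07' is a made-up name; use your own)
SELECT relname,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       autovacuum_count
FROM pg_stat_all_tables
WHERE relname = 'events_2022_07';
```

If n_dead_tup stays in the hundreds of millions between runs, the trigger thresholds are too loose for that table.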
Personally, I'd say 0.1 is still too high for a billion+ row table. That's still 100+ million dead tuples. That's a lot, plus it'll be slow to process. I'd use 0.001 (1+ million dead tuples), or maybe even lower.
Depending on the delete rate, you could even set the scale_factor to 0, and the autovacuum_vacuum_threshold to a number appropriate for an autovacuum every few days.
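For instance, a per-table override along these lines (table name hypothetical, threshold just a placeholder to tune against your delete rate) would switch that partition from a fraction-of-table trigger to a fixed row count:

```sql
-- Trigger autovacuum after a fixed number of dead rows rather than
-- a fraction of a billion-row table ('events_2022_07' is hypothetical)
ALTER TABLE events_2022_07 SET (
    autovacuum_vacuum_scale_factor = 0,
    autovacuum_vacuum_threshold    = 5000000  -- tune to your churn
);
```

Storage parameters set this way apply only to that table and override the global settings, so hot partitions can be tuned without touching postgresql.conf.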
On 14 May 2023 23:20:19 Wells Oliver <wells.oliver@xxxxxxxxx> wrote:
Yeah, we've tuned them down to a 0.1 factor and they are indeed being auto-vacuumed (though perhaps not frequently enough). I feel like I am just seeing the autovacuuming releasing a _lot_ less space than a full vacuum. Of course, I know that this is generally the case, but I am surprised by the factor... Nothing we can't live with, I was just wondering if there wasn't some clever solution. Perhaps lowering the autovacuum factor even lower would be useful here too.

On Sun, May 14, 2023 at 3:14 PM Paul Smith <paul@xxxxxxxxxx> wrote:

Have you tuned the autovacuum settings? We've found the defaults to not be great for large tables.

E.g. autovacuum_vacuum_scale_factor defaults to 0.2, so 20% of the tuples need to be deleted/updated before a vacuum is triggered. If there are a billion rows, that means there need to be 200 million deleted rows to trigger the autovacuum.

Maybe try changing autovacuum_vacuum_scale_factor to 0.001 or something for that large table.

Paul
On 14 May 2023 22:03:41 Wells Oliver <wells.oliver@xxxxxxxxx> wrote:

I think our use-case here is nothing special: just very large tables that are deleted from and re-inserted with some frequency. They store monthly data and exist as partitions. As an example, July of 2022 (summer months are peak) had 1,630,558,336 rows. We delete from this and insert daily. We rarely get new, relevant data once the month is over. The n_dead_tup from pg_stat_all_tables here was over 7m rows, and clearing that out gave us back nearly 50GB, and the file size estimate on this partition was ~200 GB. These tables get auto-vacuumed but clearly it's not, well, aggressively reclaiming space.

Given that they are partitions, VACUUM FULL locks the parent table, so new writes/deletes can't happen, and anyway people like to run queries against these tables quite frequently.

Glad to add any detail I can.

On Sun, May 14, 2023 at 11:32 AM Jeff Janes <jeff.janes@xxxxxxxxx> wrote:

On Sun, May 14, 2023 at 1:46 PM Wells Oliver <wells.oliver@xxxxxxxxx> wrote:

Noticed over the last few days how much more space VACUUM FULL reclaims (for obvious reasons) and I am wondering if there are any solutions for a more fruitful VACUUM without the locks of FULL?

There are lots of theories which are all compatible with the minimal amount of info given here. If the space is freed from the indexes, then just reindexing (which can be done concurrently in recent versions) could reclaim the space. If it is from the table itself then something like pg_squeeze (https://github.com/cybertec-postgresql/pg_squeeze/) might help.
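For the index case, a concurrent rebuild (available since PostgreSQL 12) would look something like this; the index and table names here are hypothetical:

```sql
-- Rebuild one bloated index without blocking reads or writes
-- ('events_2022_07_pkey' / 'events_2022_07' are made-up names)
REINDEX INDEX CONCURRENTLY events_2022_07_pkey;

-- Or rebuild every index on the partition at once
REINDEX TABLE CONCURRENTLY events_2022_07;
```

Unlike VACUUM FULL, this only takes brief locks at the start and end, so queries against the partition keep running.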
Or you can implement your own squeezing, by using a CTE with a ctid scan to delete tuples from the end of the table and reinsert them in the front, then use regular vacuum to truncate the now-free space off the end of the table.

I would think the more fruitful VACUUM would be to prevent the bloat from occurring in the first place, but without knowing the root cause I can't offer anything but platitudes there.

Cheers,
Jeff
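The ctid trick described above might be sketched roughly like this; the table name is hypothetical, and this is only an illustration of the idea, not a drop-in script (triggers still fire, and the reinserted rows only land near the front of the file if VACUUM has already recorded free space there):

```sql
-- Hypothetical sketch: move a batch of rows off the tail pages so a
-- plain VACUUM can truncate the end of the file. Run VACUUM first so
-- the free space map knows about holes earlier in the table, then
-- repeat this in batches, then VACUUM again.
WITH moved AS (
    DELETE FROM events_2022_07          -- made-up table name
    WHERE ctid = ANY (ARRAY(
        SELECT ctid
        FROM events_2022_07
        ORDER BY ctid DESC              -- highest pages = end of file
        LIMIT 10000
    ))
    RETURNING *
)
INSERT INTO events_2022_07
SELECT * FROM moved;
```

pg_squeeze automates essentially this kind of compaction, so the extension is usually the safer choice; this sketch is mainly to show what it is doing under the hood.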