Re: Autovacuum is cleaning very less dead tuples

Laurenz Albe <laurenz.albe@xxxxxxxxxxx> · Fri, 27 Sep 2019 07:55:06 +0200

On Fri, 2019-09-27 at 11:10 +0530, Amarendra Konda wrote:
> As part of vacuum tuning, We have set the below set of parameters. 
> 
> > select relname,reloptions, pg_namespace.nspname from pg_class join
> pg_namespace on pg_namespace.oid=pg_class.relnamespace where relname
> IN('process_instance') and pg_namespace.nspname='public';
>    relname    |                                                      
>                reloptions                                            
>                          | nspname
> --------------+----------------------------------------------------
> -------------------------------------------------------------------
> -------------------------------+---------
>  process_instance |
> {autovacuum_vacuum_scale_factor=0,autovacuum_vacuum_threshold=20000,a
> utovacuum_vacuum_cost_limit=1000,autovacuum_vacuum_cost_delay=10}    
>           | public

That's not so much tuning as breaking.

You have set autovacuum to run all the time at a snail's pace.
That way, it will have trouble getting any work done.

Don't touch autovacuum_vacuum_scale_factor and
autovacuum_vacuum_threshold. Don't raise autovacuum_vacuum_cost_limit.
If anything, lower autovacuum_vacuum_cost_delay.

> However n_dead_tup value from pg_stat_user_tables was always showing
> very high value. Most of the time, it is greater than 100K dead
> tuples.  

That is only a problem if the number of live tuples is less than
500000.

> Overall, we couldn't able to correlate on why autovacuum was able to
> cleanup only < 2K tuples, even though there are mode dead tuples
> based on the statistics ?  Can you please explain on why we are
> notcing huge difference and what steps needs to taken to minimize the
> gap ?

It is questionable if there is a problem at all.

> Log message
> 
>  2019-09-25 00:06:31 UTC::@:[80487]:LOG:  automatic vacuum of table
> "fc_db_web_2.public.process_instance": index scans: 1
> pages: 0 removed, 854445 remain, 0 skipped due to pins, 774350
> skipped frozen
> tuples: 1376 removed, 16819201 remain, 21 are dead but not yet
> removable
> system usage: CPU 44.57s/33.04u sec elapsed 5088.65 sec

This shows that at least this table has no problem.

Even with your settings, autovacuum finished in 5 seconds and
could clean up almost everything.

> Live and Dead tuples
> 
> select relname, n_live_tup, n_dead_tup FROM pg_stat_user_tables WHERE
> relname='process_instance';
>    relname    | n_live_tup | n_dead_tup 
> --------------+------------+------------
>  conversation |   16841596 |     144202

Perfect.  There is no problem at all.

The table has less than 20% dead tuples, so everything is in perfect
order.

Just stop fighting windmills.

Give up your tuning attempts and reset all parameters back to the
default.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com