There are various factors where vacuum may not get triggered. A few blogs I refer to for this are:

https://blog.2ndquadrant.com/when-autovacuum-does-not-vacuum/
https://www.cybertec-postgresql.com/en/reasons-why-vacuum-wont-remove-dead-rows/

There may be many more :), but I found my cases from these ones.

If nothing above is the problem, then I have been trying to simulate the same using pgbench. Unfortunately I am not sure what resources you have, but I am testing on a small setup.

I also have the below config to get more visibility into what gets vacuumed. This logs every table that gets vacuumed:

log_autovacuum_min_duration = '0'
autovacuum_max_workers = '6'
log_statement_stats = '1'
log_min_duration_statement = '0'
log_lock_waits = 'on'

Something like this:

2019-03-03 17:31:48 UTC LOG:  automatic vacuum of table "pgtesting.public.pgbench_branches": index scans: 1
        pages: 0 removed, 63 remain, 0 skipped due to pins, 0 skipped frozen
        tuples: 498 removed, 5009 remain, 3 are dead but not yet removable, oldest xmin: 59684
        buffer usage: 260 hits, 0 misses, 6 dirtied
        avg read rate: 0.000 MB/s, avg write rate: 10.219 MB/s
        system usage: CPU: user: 0.00 s, system: 0.00 s, elapsed: 0.00 s

etc.

You can also look at "select * from pg_stat_user_tables" (or pg_stat_all_tables) to get an idea of which tables got vacuumed and which did not (a sample query is at the end of this mail).

Note: the autovacuum settings in postgresql.conf are global, i.e. they apply to all tables. Maybe the scale factor of 0 is triggering vacuums of the smaller, frequently changing tables over and over, leaving the vacuums of the large tables to run even longer.

You can check whether autovacuum has been triggered for the large tables but has not yet completed, in which case no space has been freed up for reuse yet:

select * from pg_stat_activity where query ~ 'vacuum';

-[ RECORD 2 ]----+-------------------------------------------------------
datid            | 16384
datname          | pgtesting
pid              | 9575
usesysid         |
usename          |
application_name |
client_addr      |
client_hostname  |
client_port      |
backend_start    | 2019-03-03 17:34:18.509424+00
xact_start       | 2019-03-03 17:34:19.499784+00
query_start      | 2019-03-03 17:34:19.499784+00
state_change     | 2019-03-03 17:34:19.499785+00
wait_event_type  |
wait_event       |
state            | active
backend_xid      |
backend_xmin     | 606793
query            | autovacuum: VACUUM ANALYZE public.pgbench_accounts
backend_type     | autovacuum worker

Lastly, you can track the progress of a running vacuum via the pg_stat_progress_vacuum view:
https://www.postgresql.org/docs/10/progress-reporting.html

pgtesting=# select * from pg_stat_progress_vacuum;
-[ RECORD 1 ]------+--------------
pid                | 9575
datid              | 16384
datname            | pgtesting
relid              | 16446
phase              | scanning heap
heap_blks_total    | 8206318
heap_blks_scanned  | 2420813
heap_blks_vacuumed | 0
index_vacuum_count | 0
max_dead_tuples    | 44739242
num_dead_tuples    | 198179

If you have way too many tables, maybe you can set autovacuum parameters at the individual table level, as in https://www.keithf4.com/per-table-autovacuum-tuning/ (example at the end of this mail).

BTW, with that high a rate of transactions, what do you set as min_wal_size, max_wal_size, checkpoint_completion_target and checkpoint_timeout? I guess too-frequent checkpointing would also result in a large resource overhead, and may leave fewer resources for your other activities.

It may help others if you can let us know what resources (mem/cpu) you are using for this test. Do you have replicas? Do they have hot_standby_feedback = 'on'? Do you have long 'idle in transaction' sessions? Both of those can hold back the oldest xmin and stop vacuum from removing dead rows.

I hope I am not diverting, but just in case this helps :).
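
PS: a few concrete queries for the checks above, in case they are useful. First, for pg_stat_user_tables, something like this lists tables that still carry dead rows along with when they were last (auto)vacuumed; the 500-row cutoff is just an arbitrary example:

select schemaname, relname, n_live_tup, n_dead_tup,
       last_vacuum, last_autovacuum, autovacuum_count
from pg_stat_user_tables
where n_dead_tup > 500        -- arbitrary cutoff, pick what suits you
order by n_dead_tup desc;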
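
For the per-table tuning, the idea from that blog is roughly this: override the global scale factor with storage parameters on the big tables. pgbench_accounts is just my test table here, and the numbers are only illustrative:

alter table pgbench_accounts set (
    autovacuum_vacuum_scale_factor = 0,    -- ignore table size entirely
    autovacuum_vacuum_threshold = 100000   -- vacuum after ~100k dead rows
);

That way the small hot tables can keep the global settings while the large ones get their own trigger points.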
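
To answer the checkpoint question, the current values can be pulled straight out of pg_settings:

select name, setting, unit
from pg_settings
where name in ('min_wal_size', 'max_wal_size',
               'checkpoint_timeout', 'checkpoint_completion_target');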
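
And to spot long 'idle in transaction' sessions:

select pid, usename, state, xact_start,
       now() - xact_start as xact_age, query
from pg_stat_activity
where state like 'idle in transaction%'
order by xact_start;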
Thanks,
Vijay

On Mon, Mar 4, 2019 at 9:56 PM Ron <ronljohnsonjr@xxxxxxxxx> wrote:
>
> On 3/4/19 4:53 AM, Aliza Abulafia wrote:
> > Hi
> >
> > we are evaluating postgresql 11.1 for our production.
> > Having a system with 4251 updates per second, ~1000 deletes per second, ~3221 inserts per second and 1 billion transactions per day,
> > we face a challenge where PostgreSQL does not reuse its (delete/update) space, and tables constantly increase in size.
> > we configured aggressive autovacuum settings to avoid the wraparound situation. also tried adding periodic execution of vacuum analyze and vacuum –
> > and still there is no space reuse. (only vacuum full or pg_repack release space to the operating system – but this is not a reuse)
> >
> > following are our vacuum settings:
> > autovacuum                          | on
> > vacuum_cost_limit                   | 6000
> > autovacuum_analyze_threshold        | 50
> > autovacuum_vacuum_threshold         | 50
> > autovacuum_vacuum_cost_delay        | 5
> > autovacuum_max_workers              | 32
> > autovacuum_freeze_max_age           | 2000000
> > autovacuum_multixact_freeze_max_age | 2000000
> > vacuum_freeze_table_age             | 20000
> > vacuum_multixact_freeze_table_age   | 20000
> > vacuum_cost_page_dirty              | 20
> > vacuum_freeze_min_age               | 10000
> > vacuum_multixact_freeze_min_age     | 10000
> > log_autovacuum_min_duration         | 1000
> > autovacuum_naptime                  | 10
> > autovacuum_analyze_scale_factor     | 0
> > autovacuum_vacuum_scale_factor      | 0
> > vacuum_cleanup_index_scale_factor   | 0
> > vacuum_cost_delay                   | 0
> > vacuum_defer_cleanup_age            | 0
> > autovacuum_vacuum_cost_limit        | -1
> > autovacuum_work_mem                 | -1
>
> How frequently did you manually vacuum?
>
> For example, generate a list of tables with a "sufficient" number of dead tuples, and then manually vacuum them in parallel:
>
> TABLES=`mktemp`
> psql -AXt -F ' ' $DB -c "SELECT '-t', schemaname||'.'||relname
>                          FROM pg_stat_all_tables
>                          WHERE n_dead_tup > 500 -- or whatever number you think best
>                          ORDER BY 2;" > $TABLES
> vacuumdb --jobs=6 --dbname=$DB `cat $TABLES`
> psql -c "CHECKPOINT;"
>
> --
> Angular momentum makes the world go 'round.