Thanks! But if I set autovacuum_vacuum_scale_factor I am afraid it will make autovacuum less aggressive. It is already not aggressive enough. I am trying to make it more aggressive.
Formula:
Autovacuum VACUUM thresold for a table = autovacuum_vacuum_scale_factor * number of tuples + autovacuum_vacuum_threshold
From: MichaelDBA <MichaelDBA@xxxxxxxxxxx>
Sent: Friday, May 19, 2023 10:48 AM
To: Murthy Nunna <mnunna@xxxxxxxx>
Cc: pgsql-admin@xxxxxxxxxxxxxxxxxxxx
Subject: Re: Vacuum Tuning Question
You turned off autovacuum_vacuum_scale_factor!
Murthy Nunna wrote on 5/19/2023 11:35 AM:
Hi,
I have a fairly large database (several TBs) where auto vacuum is enabled. My focus is to keep the datfrozenxid reasonably low and avoid manual vacuum. When I run following query, I see the percentage creeping up every day. Right after running vacuum manually on tables with large relfrozenxid the following query would return 7% on pgprd1 database but after 5 months it increased to 40%. So, eventually I am afraid I have to vacuum the tables manually which has its own problems like creating massive WALs in a short time etc. I would like to avoid manual vacuuming for this reason.
SELECT freez, txns, ROUND(100*(txns/freez::float)) AS perc, datname
FROM (SELECT foo.freez::int, age(datfrozenxid) AS txns, datname
FROM pg_database d JOIN (SELECT setting AS freez FROM pg_settings WHERE name = 'autovacuum_freeze_max_age') AS foo
ON (true) WHERE d.datallowconn) AS foo2 ORDER BY 3 DESC, 4 ASC;
1500000000 | 599260139 | 40 | pgprd1
1500000000 | 50138249 | 3 | postgres
1500000000 | 50138249 | 3 | template1
Auto vacuum is working as I can see from logs it is repeatedly vacuuming same tables, but I don’t see that as a problem because those are the tables that get updated/inserted continuously.
Following are the settings I have. I am wondering if there is a way autovacuum can keep the above “datfrozenxid” low and not keep increasing. Thank you ahead time for reading my post.
postgres=# select name, setting, unit from pg_settings where name like '%autovacuum%';
name | setting | unit
---------------------------------------+------------+------
autovacuum | on |
autovacuum_analyze_scale_factor | 0.1 |
autovacuum_analyze_threshold | 5000 |
autovacuum_freeze_max_age | 1500000000 |
autovacuum_max_workers | 5 |
autovacuum_multixact_freeze_max_age | 400000000 |
autovacuum_naptime | 60 | s
autovacuum_vacuum_cost_delay | 2 | ms
autovacuum_vacuum_cost_limit | -1 |
autovacuum_vacuum_insert_scale_factor | 0 |
autovacuum_vacuum_insert_threshold | 5000 |
autovacuum_vacuum_scale_factor | 0 |
autovacuum_vacuum_threshold | 5000 |
autovacuum_work_mem | -1 | kB
log_autovacuum_min_duration | 0 | ms
(15 rows)
On Fri, May 19, 2023 at 12:48 PM Murthy Nunna <mnunna@xxxxxxxx> wrote:
I wouldn't recommend completely disabling the scale factor for the entire database. The way you set it, every single table in the database must have 5000 row changes before autovac will kick in. I'd imagine there are system catalogs and other small tables in your DB that never see this many changes for a long time. So leaving the scale factor at something like .10 or .05 for the autovacuum_vacuum_scale_factor is not a bad idea to at least get more of those smaller tables vacuumed more often. Disabling scale factor and only relying on the threshold is better done on a per-table basis.
Also, I wouldn't worry about the datfrozenid that much unless you see it actually exceeding autovacuum_freeze_max_age for extended periods of time. Simply reaching this value isn't really a bad thing and the more aggressive autovac that kicks in usually isn't a big deal unless you're seeing many of your largest tables all reaching it at the same time. The threshold tuning on a per-table basis should help with that. As long as PG is efficiently dealing with tables reaching autovacuum_freeze_max_age, there's really nothing to worry about. Just keep an eye on it via monitoring and you should be fine. See the query in this blog post for an example.
Watch for it reaching 110-125% for alerting, not exactly 100%. Staying consistently above 100% is the problem and means autovac is not keeping up, hence the higher alerting thresholds.
Also keep up with the most recent major version of PG that you can. Txn exhaustion has been getting a lot more attention lately and improving greatly.