On Thu, Mar 19, 2020 at 5:35 PM Andres Freund <andres@xxxxxxxxxxx> wrote: > > Hi, > > On 2020-03-19 10:23:48 -0500, Justin King wrote: > > > From a single stats snapshot we can't actually understand the actual xid > > > consumption - is it actually the xid usage that triggers the vacuums? > > > > We have looked at this and the xid consumption averages around 1250 > > xid/sec -- this is when we see the "aggressive" autovac kick off in > > the logs. What I don't understand is why these xid's are being > > consumed at this rate on the databases with no activity (postgres, > > template1). > > The xid counter is global across all databases. Then what does the "age" value represent for each database in this case? Perhaps I'm misunderstanding what I'm looking at? postgres=# SELECT datname, age(datfrozenxid), current_setting('autovacuum_freeze_max_age') FROM pg_database; datname | age | current_setting -----------+-----------+----------------- postgres | 100937449 | 200000000 template1 | 50244438 | 200000000 template0 | 160207297 | 200000000 feedi | 150147602 | 200000000 > > > > What makes you think it is a problem that you have all these vacuums? If > > > you actually update that much, and you have indexes, you're going want a > > > lot of vacuums? > > > I actually don't think there's a problem with the vacuums (I was > > mostly pointing out that they are very regular and not problematic). > > The main problem I am having is that something is causing the > > autovacuums to completely stop and require manual intervention to > > resume -- and it seems to be when the "postgres" or "template1" > > database hits the autovacuum_freeze_max_age. > > Did you look at pg_stat_activity for those autovacuums to see whether > they're blocked on something? This is not something we've done yet but will next time it occurs. > > > > What is interesting is that this happens with the 'postgres' and > > > > 'template1' databases as well and there is absolutely no activity in > > > > those databases. > > > > > > That's normal. They should be pretty darn quick in v12? > > > > Yes, a manual VACUUM FREEZE of either database takes less than 1 > > second -- which is why it's perplexing that the autovac starts and > > never seems to complete and prevents other autovacs from running. > > One big difference between a manual VACUUM and autovacuum is that with > the default settings VACUUM is not throttled, but autovacuum is. > > What are your vacuum_cost_delay, autovacuum_vacuum_cost_delay, > vacuum_cost_limit, autovacuum_vacuum_cost_limit, vacuum_cost_page_hit, > vacuum_cost_page_miss set to? Here are all the vacuum related values for the server: postgres=# select name,setting from pg_settings where name like '%vacuum%'; autovacuum = on autovacuum_analyze_scale_factor = 0.1 autovacuum_analyze_threshold = 2500 autovacuum_freeze_max_age = 200000000 autovacuum_max_workers = 8 autovacuum_multixact_freeze_max_age = 400000000 autovacuum_naptime = 15 autovacuum_vacuum_cost_delay = 20 autovacuum_vacuum_cost_limit = -1 autovacuum_vacuum_scale_factor = 0.2 autovacuum_vacuum_threshold = 500 autovacuum_work_mem = -1 log_autovacuum_min_duration = 0 vacuum_cleanup_index_scale_factor = 0.1 vacuum_cost_delay = 0 vacuum_cost_limit = 1000 vacuum_cost_page_dirty = 20 vacuum_cost_page_hit = 1 vacuum_cost_page_miss = 10 vacuum_defer_cleanup_age = 0 vacuum_freeze_min_age = 50000000 vacuum_freeze_table_age = 150000000 vacuum_multixact_freeze_min_age = 5000000 vacuum_multixact_freeze_table_age = 150000000 I know the database is busy, so the throttling makes sense, but it seems like it would complete eventually. We see blocked autovacs for many hours. > Greetings, > > Andres Freund