On Mon, 2022-11-28 at 04:05 +0000, senor wrote: > I'm a little late getting back to this but still have no solution. > > I had inadvertently executed a vacuum freeze on a table that pg_stat_activity > indicated was currently being autovacuumed. The manual vacuum succeeded while > leaving the previous autovacuum still showing active in pg_stat_activity. No, that cannot be. VACUUM processes are mutually exclusive. It could have been another autovacuum worker (different "pid"). > Manual vacuum always completes, never stalls, but also often does not appear > in pg_stat_progress_vacuum unless it's a longer process. It *does* appear in "pg_stat_progress_vacuum", but perhaps you are too slow to catch it. > It appears the process completes the vacuum but does not register that fact. > relfrozenxid of the main table is what would be expected but an associated > toast table was still very old. Cancelling all pending vacuums of the table > and manually running vacuum completes in a few seconds and both the main > relation and toast are updated as expected with last vacuum time updated > in pg_stat_all_tables. Obviously autovacuum devided to process the main table, but not the TOAST table. That is normal. Manual VACUUM processes both, unless you specify the option PROCESS_TOAST OFF. > Autoacuum and autovacuum analyze both get hung. No, they don't; not unless they are anti-wraparound autovacuum runs and you are holding high locks on the table in a long running transaction. They may be slow to complete, because in v11 autovacuum by default is very slow indeed, as "autovacuum_vacuum_cost_delay" is 20ms. > I often see the same table listed multiple times in pg_stat_activity with > different pids and: > state: active > backend_type: autovacuum worker > wait_event_type: blank > state_change: 10-15 seconds after backend_start - about how long it takes > to manually run vacuum on most tables. That can be parallel workers that are used to scan indexes. > What exactly does autovacuum rely on in the stats file? I ran strace on a > hung autovacuum process and saw a repeated read of > /run/postgresql/db_16384.stat (tempfs). The file is 740MB which is about > the same as other similar installations I've reviewed. I'm lacking in overall > experience in this though. I don't know either, and I am too lazy to read the code on that, but I suspect that it has no connection to your problem. > One final oddity: > I ran a query for oldest relfrozenxid and redirected to file. The query took > around a minute. A few seconds after it finished, I queried for 'active' in > pg_stat_activity and the oldest relfrozenxid query was still listed. > A few seconds later it had cleared. That sounds weird and is hard to believe. Are the disk or the CPU under extreme stress? Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com