>From my experience is good practice to: - Monitor the age of your records (we use icinga/nagios for that) to know on time when things are going wrong - Schedule a periodic VACUUM of your db, if possible. Eg: if your db is heavily used during the day but resting at night, then during the night is a good idea to run a VACUUM. This will also offload your db during the day, because less autovacuum jobs will be triggered. As additional note, if I remember correctly, a vacuum will be triggered to prevent wraparound also if you set 'autovacuum=off' in your configuration. That specific vacuum will have priority over normal vacuums. For statistical purposes, a good formula to understand your actual 'burn rate' is to periodically run: SELECT * FROM txid_current(); and compare results over time to have a good estimate on how many xid you are using daily. Regards, Fabio On 03/06/2018 12:54 PM, Subramanian Krishnan wrote: > Hello Laurenz, > > Thank you for taking the time out to read through and problem description and responding back. > > After posting the question on the forum, I continued investigating possible root causes. Through a flash of intuition I suspected if it could be a table which is strictly not a part of the database (not listed in pg_class) which could be contributing factor. > > pg_shdepend fitted the bill. So I did a vacuum on pg_catalog.pg_shdepend and post that re-ran the DB age query. The trick worked and the age came down to match the max age entry in pg_class. To be on the safer side, I did vacuuming of pg_shdescription and pg_shseclabel as well. > > Your suggestion of: > 1) Making AV more aggressive and > 2) Tiding over the present problem of aged TOAST tables though manual/scripted VACUUM is valid and we are working on that currently. > > Thanks and Regards, > Subu > > On Mon, Mar 5, 2018 at 2:54 PM, Laurenz Albe <laurenz.albe@xxxxxxxxxxx <mailto:laurenz.albe@xxxxxxxxxxx>> wrote: > > Subramanian Krishnan wrote: > > We are running PostgreSQL 9.4.14 and recently we received an alert for transaction id > > crossing the 1 billion mark. Since we can hit the transaction id wraparound issue for aging tables, > > we investigated by running the following queries: > > > > mpsdb=> select * from txid_current(); > > txid_current > > -------------- > > 1100826671 > > (1 row) > > > > mpsdb=> SELECT datname, age(datfrozenxid) FROM pg_database ORDER BY 2 DESC LIMIT 20; > > datname | age > > -----------+------------ > > mpsdb | 1087909867 > > rdsadmin | 1000562517 > > postgres | 950697143 > > template0 | 900709601 > > template1 | 900697643 > > (5 rows) > > > > mpsdb=> SELECT c.relname as table_name, c.relkind as type, age(c.relfrozenxid) as age, c.relfrozenxid FROM pg_class AS c WHERE age(c.relfrozenxid) <> 2147483647 ORDER BY 3 DESC LIMIT 20; > > table_name | type | age | relfrozenxid > > -------------------+------+------------+-------------- > > pg_toast_22985917 | t | 1042877986 | 57941433 > > pg_toast_22986301 | t | 1042877877 | 57941542 > > pg_toast_23823161 | t | 1041791836 | 59027583 > > pg_toast_23823242 | t | 1041791776 | 59027643 > > pg_toast_23904594 | t | 1041658236 | 59161183 > > pg_toast_23904629 | t | 1041658221 | 59161198 > > pg_toast_23904640 | t | 1041658208 | 59161211 > > pg_toast_23904648 | t | 1041658200 | 59161219 > > pg_toast_23904658 | t | 1041658191 | 59161228 > > pg_toast_23904666 | t | 1041658183 | 59161236 > > pg_toast_23904674 | t | 1041658175 | 59161244 > > pg_toast_23904682 | t | 1041658169 | 59161250 > > pg_toast_23904690 | t | 1041658161 | 59161258 > > pg_toast_23904698 | t | 1041658156 | 59161263 > > pg_toast_23904706 | t | 1041658149 | 59161270 > > pg_toast_23904714 | t | 1041658129 | 59161290 > > pg_toast_23904722 | t | 1041658096 | 59161323 > > pg_toast_23904730 | t | 1041658090 | 59161329 > > pg_toast_23904738 | t | 1041658085 | 59161334 > > pg_toast_23904746 | t | 1041658077 | 59161342 > > (20 rows) > > > > > > mpsdb=> SELECT datname, usename, pid, waiting, current_timestamp - xact_start AS xact_runtime, query FROM pg_stat_activity WHERE upper(query) like '%VACUUM%' ORDER BY xact_start; > > datname | usename | pid | waiting | xact_runtime | query > > ---------+----------+-------+---------+-----------------+------------------------------------------------------------------------ > > mpsdb | rdsadmin | 13833 | f | 00:00:14.161262 | autovacuum: VACUUM pg_toast.pg_toast_242649675 (to prevent wraparound) > > mpsdb | rdsadmin | 14565 | f | 00:00:11.735571 | autovacuum: VACUUM pg_toast.pg_toast_242649757 (to prevent wraparound) > > mpsdb | rdsadmin | 15315 | f | 00:00:09.303762 | autovacuum: VACUUM pg_toast.pg_toast_242649819 (to prevent wraparound) > > mpsdb | rdsadmin | 14935 | f | 00:00:06.893078 | autovacuum: VACUUM pg_toast.pg_toast_243226657 (to prevent wraparound) > > mpsdb | rdsadmin | 15851 | f | 00:00:04.322474 | autovacuum: VACUUM pg_toast.pg_toast_243227582 (to prevent wraparound) > > mpsdb | rdsadmin | 15615 | f | 00:00:01.768495 | autovacuum: VACUUM pg_toast.pg_toast_243226332 (to prevent wraparound) > > > > It is clear that we need to vacuum the most aged pg_toast tables which are not yet picked up by the autovacuum process. > > > > The question which is bothering us though is: > > As per the queries we ran the most aged transaction and table is 1042877986 (pg_toast_22985917). > > And as per PostgreSQL pg_database documentation the most aged table/transaction in pg_class determines > > age of the database (via the datfrozenxid value). If that is the case we would have expected the age > > of mpsdb database to be 1042877986 (or in this range) but what we see as age is 1087909867. > > > > Does this mean there are other aging tables we are not aware of? If yes, how do we get to those > > tables since we have already queried pg_class for the most aged table? > > > > Request any help/guidance we can get regarding this. > > That is indeed strange. > > Reading the code, I see that at the end of a VACUUM, "datfrozenxid" is set > to the minimum value of all "relfrozenxid" for all tables in the database > obtained with a sequential scan of "pg_class". > > The only exception is that if a table is found where "relfrozenxid" is in the > future, nothing is done. The comment suggests that that should normally not > happen, but "has been known to arise due to bugs in pg_upgrade". > > - Can you find tables with negative "relfrozenxid" age in "pg_class"? > > - The other option is that a VACUUM finished between the query of "pg_database" > and the query of "pg_class". Is that an option? > > You should tune autovacuum to be more aggressive so that it gets done processing > the tables. To get rid of the immediate problem, you could schedule a manual > VACUUM of the tables to which the TOAST table belong. > > Yours, > Laurenz Albe > >