Rural Hunter <ruralhunter@xxxxxxxxx> wrote: > Let me put here the whole scenario: > 1. I was called by our application users that all the updating was > failing. So I went to check the db. Any update transaction including > manual vacuum is blocked out by the error message: > ERROR: database is not accepting commands to avoid wraparound data loss > in database "db1" > Suggestion:Stop the postmaster and use a standalone backend to > vacuum that database. > > 2. Since db1 is a very large database(it is the main db the user is > using) I can not afford to take long time to vacuum full on that. So I > thought about to try on other small dbs first. Why in the world would you want to use VACUUM FULL in this circumstance? > 3. I stop the instance. > > 4. I use "echo 'vacuum full;' | postgres --single -D /pgdata [other > dbs]" to vacuum some other dbs. I still got several warning messages > when vacuum the first database(let's say db2): > 2013-09-14 08:56:44 CST [5536]: [453-1] user=,db=,host= WARNING: > database "db1" must be vacuumed within 999775 transactions > 2013-09-14 08:56:44 CST [5536]: [454-1] user=,db=,host= HINT: To avoid > a database shutdown, execute a database-wide VACUUM in that database. WARNING, not error, so the VACUUM would have run. > Here the error message still points to db1. I'm not sure which database would be referenced if the table which needed the VACUUM was a shared table, like pg_database or pg_authid. > 5. When I ran the single connection vacuum on other dbs(not db1), there > was not any error/warning message. So I tried to start whole instance. > > 6. I started the instance and found everything is fine. > > So actually I have 3 questions here: > 1. Was the db name in the error message wrong? Probably not, to the extent that running VACUUM (FULL is not necessary) against that database would have solved the problem. If it was a shared catalog table it might be that it was not the *only* database which would work. > 2. How would that happend? Shouldn't auto vacuum handle it and avoid > such problem? There are two possibilities -- either you had a long-running transaction in the cluster or your autovacuum is not configured to be aggressive enough to keep you out of trouble. > 3. How to detect such problem earlier? We would need a description of the machine (cores, RAM, storage system) and the output of these queries to be able to make good suggestions on tuning autovacuum: SELECT version(); SELECT name, current_setting(name), source FROM pg_settings WHERE source NOT IN ('default', 'override'); In addition, make sure that you are monitoring for long-running transactions. A reasonable monitoring scheme might be to alert when either of these queries returns any rows: select * from pg_stat_activity where xact_start < (now() - interval '1 hour'); select * from pg_prepared_xacts where prepared < (now() - interval '1 minute'); You can, of course, adjust the intervals to what makes the most sense for your environment. If you have max_prepared_transactions set to zero, the latter query is not really necessary. -- Kevin Grittner EDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-admin