Error on vacuum: xmin before relfrozenxid

Paolo Crosato <paolo.crosato@xxxxxxxxx> · Tue, 22 May 2018 16:18:20 +0200

Hi,
we have an error happening on a catalog table on one of the dbs in the instance. 
This is the error report form:

A description of what you are trying to achieve and what results you expect.:

Any kind of vacuum fails on pg_authid table, I would expect it to succeed. This is occasionaly blocking autovacuums, so we must resort to manual vacuum of busy tables, which is bearable but inconvenient.

PostgreSQL version number you are running:

PostgreSQL 10.4 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit

How you installed PostgreSQL:

From the pgdg yum repositories.

Changes made to the settings in the postgresql.conf file:  see Server Configuration for a quick way to list them all.

             name             |                               current_setting                                |        source
------------------------------+------------------------------------------------------------------------------+----------------------
 application_name             | psql                                                                         | client
 archive_command              | rsync -a %p barman@x.x.x.x:/data/backup/barman/ubipgsql-10/incoming/%f | configuration file
 archive_mode                 | on                                                                           | configuration file
 checkpoint_completion_target | 0.9                                                                          | configuration file
 checkpoint_timeout           | 25min                                                                        | configuration file
 client_encoding              | UTF8                                                                         | client
 DateStyle                    | ISO, MDY                                                                     | configuration file
 default_text_search_config   | pg_catalog.english                                                           | configuration file
 dynamic_shared_memory_type   | posix                                                                        | configuration file
 effective_cache_size         | 64GB                                                                         | configuration file
 hot_standby                  | on                                                                           | configuration file
 hot_standby_feedback         | on                                                                           | configuration file
 lc_messages                  | en_US.UTF-8                                                                  | configuration file
 lc_monetary                  | en_US.UTF-8                                                                  | configuration file
 lc_numeric                   | en_US.UTF-8                                                                  | configuration file
 lc_time                      | en_US.UTF-8                                                                  | configuration file
 listen_addresses             | *                                                                            | configuration file
 log_autovacuum_min_duration  | 1s                                                                           | configuration file
 log_checkpoints              | on                                                                           | configuration file
 log_destination              | syslog                                                                       | configuration file
 log_directory                | log                                                                          | configuration file
 log_filename                 | postgresql-%a.log                                                            | configuration file
 log_line_prefix              | user=%u,db=%d,client=%h                                                      | configuration file
 log_lock_waits               | on                                                                           | configuration file
 log_min_duration_statement   | 1s                                                                           | configuration file
 log_min_messages             | info                                                                         | configuration file
 log_rotation_age             | 1d                                                                           | configuration file
 log_rotation_size            | 0                                                                            | configuration file
 log_statement                | ddl                                                                          | configuration file
 log_temp_files               | 1MB                                                                          | configuration file
 log_timezone                 | Europe/Rome                                                                  | configuration file
 log_truncate_on_rotation     | on                                                                           | configuration file
 logging_collector            | on                                                                           | configuration file
 maintenance_work_mem         | 1GB                                                                          | configuration file
 max_connections              | 1000                                                                         | configuration file
 max_stack_depth              | 2MB                                                                          | environment variable
 max_wal_size                 | 12GB                                                                         | configuration file
 min_wal_size                 | 80MB                                                                         | configuration file
 password_encryption          | scram-sha-256                                                                | configuration file
 pg_stat_statements.max       | 10000                                                                        | configuration file
 pg_stat_statements.track     | all                                                                          | configuration file
 shared_buffers               | 32GB                                                                         | configuration file
 shared_preload_libraries     | pg_stat_statements                                                           | configuration file
 syslog_facility              | local0                                                                       | configuration file
 syslog_ident                 | postgres                                                                     | configuration file
 TimeZone                     | Europe/Rome                                                                  | configuration file
 track_activity_query_size    | 2048                                                                         | configuration file
 track_functions              | all                                                                          | configuration file
 track_io_timing              | on                                                                           | configuration file
 wal_keep_segments            | 200                                                                          | configuration file
 wal_level                    | logical                                                                      | configuration file
(51 rows)

Operating system and version:

CentOS Linux release 7.4.1708 (Core)
Linux xx.xx.com 3.10.0-693.11.1.el7.x86_64 #1 SMP Mon Dec 4 23:52:40 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux

What program you're using to connect to PostgreSQL:

psql, jdbc driver. It's not an application side error.

Is there anything relevant or unusual in the PostgreSQL server logs?:

Yes, we see these kind of errors:

May 20 16:06:04 ubipgsql01 postgres[26739]: [1380-1] user=,db=,client= ERROR:  found xmin 2889675859 from before relfrozenxid 400011439
May 20 16:06:04 ubipgsql01 postgres[26739]: [1380-2] user=,db=,client= CONTEXT:  automatic vacuum of table "postgres.pg_catalog.pg_authid"

When these errors pop up, autovacuum repeatedly fails on this very same table, and we have to resort to manual vacuums

For questions about any kind of error:
What you were doing when the error happened / how to cause the error:

The first time we saw the error, we found an idle transaction was left open for several days. The user was not monitored for long running transactions. We killed the session, deleted the offending row (the one with the xmin value reported) and then vacuumed the table.
However, teh same error pops up every time we try to add a new user. We would like to fix these error once and for all, so we can add new users.

The EXACT TEXT of the error message you're getting, if there is one: (Copy and paste the message to the email, do not send a screenshot)

This is the error: 

psql (10.4)
Type "help" for help.

postgres=# vacuum pg_authid;
ERROR:  found xmin 3031994631 from before relfrozenxid 400011439

Some datas:

postgres=# select xmin from pg_authid ;
    xmin
------------
          1
          1
          1
          1
          1
        557
    7216348
  110077819
  110511334
 3031994631
 3032044199
 3032044199
 3032044199
 3032070282
(14 rows)

postgres=# select relfrozenxid from pg_class where relname='pg_authid';
 relfrozenxid
--------------
    400011439
(1 row)

postgres=#

Is this a sympthom of data corruption or transaction wraparound due to the long running transaction that we killed weeks ago? This is the only table in the whole cluster that has this error. We are monitoring transactions wraparound with the the check_postgres.pl script, the check is still running fine and no alert was given at all since the cluster has been running.

Thank you in advance for any answer.

Paolo Crosato