MultiXactId xxxxxxxxx has not been created yet -- apparent wraparound

David M <sircement@xxxxxxxxxxx> · Tue, 14 Jun 2016 05:39:24 +0000

Hey, having issues with 'MultiXactId xxxxxxxxx has not been created yet -- apparent wraparound' log messages upon table vacuums. The vacuums are cancelled by the system after this message is issued.

These messages have only been seen since upgrading from 9.2.x to 9.4.7 via pg_upgrade about a month ago. This is a large / high transaction rate (1120 tps) database. We are unable to vacuum the affected tables (5 of them) currently due to this issue. 

In pg_multixact/members we have 34878 files (0000 through 883D).
In pg_multixact/offsets we have 4833 files (A369 through B649).
The files from members and offsets all span the same time range (starting from pg_upgrade time) to present.

We have multiple databases (4 of them in total) which all show the same behavior where the members and offsets files are concerned; ie: they are stacking up from the time the upgrade to 9.4 was performed. 

Would love to get some feedback / 'where to next' advice for this issue but also have some direct questions:
1) Are there any manual activities that must be carried out to ensure safety around multi-transactions or should autovacuum be enough to take care of any required maintenance ?
2) Should we expect to see the contents of the pg_multixact directories members and offsets grow from the point of upgrade (forever !?) or should these be getting cleaned up on a regular basis by some process manual or otherwise that is not working for
 us ?
3) Given the messages is it guarenteed that we have lost or corrupted data or could the log messages be misleading ?
4) What is the best course of action from here to fix the tables that we are having issues with ? (have already experimented with 'SELECT * FROM tablename FOR UPDATE' in a scratch system successfully allowing us to vacuum freeze the table)
5) What should be put in place to ensure we do not have a repeat of this ?
6) How can we establish how this situation has come about ?

Our pg_controldata output is as below:
pg_control version number:            942
Catalog version number:               201409291
Database system identifier:           6286682415590481708
Database cluster state:               in production
pg_control last modified:             Tue 14 Jun 2016 05:05:57 AM UTC
Latest checkpoint location:           BCE7/19AB0390
Prior checkpoint location:            BCE6/F9871CE8
Latest checkpoint's REDO location:    BCE7/9008718
Latest checkpoint's REDO WAL file:    000000010000BCE700000009
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          3/2834094405
Latest checkpoint's NextOID:          93739782
Latest checkpoint's NextMultiXactId:  3097824035
Latest checkpoint's NextMultiOffset:  2063874208
Latest checkpoint's oldestXID:        2173560081
Latest checkpoint's oldestXID's DB:   16934
Latest checkpoint's oldestActiveXID:  2834094386
Latest checkpoint's oldestMultiXid:   2741572175
Latest checkpoint's oldestMulti's DB: 16934
Time of latest checkpoint:            Tue 14 Jun 2016 05:05:16 AM UTC
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     0/0
Min recovery ending loc's timeline:   0
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
Current wal_level setting:            hot_standby
Current wal_log_hints setting:        off
Current max_connections setting:      1000
Current max_worker_processes setting: 8
Current max_prepared_xacts setting:   1000
Current max_locks_per_xact setting:   64
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                16777216
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0