Ben Chobot wrote on 8/1/20 9:58 AM:Alvaro Herrera wrote on 8/1/20 9:35 AM: So here is me fumbling how to do this. Hopefully this is useful? The error for this host was: 2020-07-30T14:59:36.839243+00:00 host postgres[24338]: [45253-1] db=,user= WARNING: specified item offset is too large 2020-07-30T14:59:36.839307+00:00 host postgres[24338]: [45253-2] db=,user= CONTEXT: WAL redo at A0A/AC4204A0 for Btree/INSERT_LEAF: off 48 2020-07-30T14:59:36.839337+00:00 host postgres[24338]: [45254-1] db=,user= PANIC: btree_xlog_insert: failed to add item 2020-07-30T14:59:36.839366+00:00 host postgres[24338]: [45254-2] db=,user= CONTEXT: WAL redo at A0A/AC4204A0 for Btree/INSERT_LEAF: off 48 2020-07-30T14:59:37.587173+00:00 host postgres[24337]: [11-1] db=,user= LOG: startup process (PID 24338) was terminated by signal 6: Aborted 2020-07-30T14:59:37.587266+00:00 host postgres[24337]: [12-1] db=,user= LOG: terminating any other active server processes So, starting from a filesystem snapshot where the db had crashed and not yet been started, I did: /usr/lib/postgresql/12/bin/pg_waldump -p /var/lib/postgresql/12/main/pg_wal/ 0000000100000A0A000000AB 0000000100000A0A000000AD | grep A0A/AC4204A0 This shows me: rmgr: Btree len (rec/tot): 72/ 72, tx: 76396065, lsn: A0A/AC4204A0, prev A0A/AC420450, desc: INSERT_LEAF off 48, blkref #0: rel 16605/16613/60529051 blk 6501 So then I did: /usr/lib/postgresql/12/bin/pg_waldump -p /var/lib/postgresql/12/main/pg_wal/ 0000000100000A0A000000AB 0000000100000A0A000000AD | grep 16605/16613/60529051 rmgr: Btree len (rec/tot): 53/ 813, tx: 76393363, lsn: A0A/AB289778, prev A0A/AB287F30, desc: INSERT_LEAF off 26, blkref #0: rel 16605/16613/60529051 blk 6089 FPW rmgr: Btree len (rec/tot): 72/ 72, tx: 76393390, lsn: A0A/AB2BE870, prev A0A/AB2BD6E0, desc: INSERT_LEAF off 128, blkref #0: rel 16605/16613/60529051 blk 6519 rmgr: Btree len (rec/tot): 72/ 72, tx: 76393394, lsn: A0A/AB2C43D0, prev A0A/AB2C4378, desc: INSERT_LEAF off 41, blkref #0: rel 16605/16613/60529051 blk 6501 rmgr: Btree len (rec/tot): 72/ 72, tx: 76393992, lsn: A0A/AB5FA848, prev A0A/AB5FA7F0, desc: INSERT_LEAF off 133, blkref #0: rel 16605/16613/60529051 blk 5999 rmgr: Btree len (rec/tot): 72/ 72, tx: 76394938, lsn: A0A/ABE49738, prev A0A/ABE47AF8, desc: INSERT_LEAF off 189, blkref #0: rel 16605/16613/60529051 blk 6519 rmgr: Btree len (rec/tot): 72/ 72, tx: 76394977, lsn: A0A/ABEB0330, prev A0A/ABEB02D8, desc: INSERT_LEAF off 204, blkref #0: rel 16605/16613/60529051 blk 6519 rmgr: Btree len (rec/tot): 72/ 72, tx: 76396065, lsn: A0A/AC4204A0, prev A0A/AC420450, desc: INSERT_LEAF off 48, blkref #0: rel 16605/16613/60529051 blk 6501 rmgr: Btree len (rec/tot): 72/ 72, tx: 76396075, lsn: A0A/AC427518, prev A0A/AC4274C0, desc: INSERT_LEAF off 211, blkref #0: rel 16605/16613/60529051 blk 6519 rmgr: Btree len (rec/tot): 53/ 2529, tx: 76396250, lsn: A0A/AC5352F8, prev A0A/AC533A00, desc: INSERT_LEAF off 87, blkref #0: rel 16605/16613/60529051 blk 5639 FPW pg_waldump: fatal: error in WAL record at A0A/AC5411B0: invalid resource manager ID 110 at A0A/AC5411E0 ...and I have no idea what I'm looking at. I assume/hope the error at the end is due to the db shutting down, and nothing to be particularly worried about? I have the specific index that postgres tripped on here and I'll spend some time today trying to see if the other crashes were the same index. |