Adam Witney <awitney@xxxxxxxxxx> writes: > I think I have found the offending row in measured_bioassay_base... The > entry in its toast table looks like this > bugasbase2=# select chunk_id, chunk_seq, length(chunk_data) from > pg_toast.pg_toast_134401982 where chunk_id = 144391872; > chunk_id | chunk_seq | length > -----------+-----------+-------- > 144391872 | 0 | 1998 > 144391872 | 1 | 1998 > 144391872 | 2 | 1998 > 144391872 | 3 | 1998 > 144391872 | 4 | 1998 > 7625296 | 3292 | 24 > 7625297 | 3292 | 24 > 7625298 | 3292 | 24 > 7625299 | 3292 | 24 > 7625308 | 3292 | 19 This is pretty wacko, because as far as I can see there is nothing wrong with the index at all --- in particular, no gap in the chunk id/seq series here, according to pg_filedump. (The last two words of each index item are the chunk_id and chunk_seq, and we are looking at LSB-first data) Item 102 -- Length: 16 Offset: 6544 (0x1990) Flags: USED Block Id: 165390 linp Index: 4 Size: 16 Has Nulls: 0 Has Varwidths: 0 1990: 02000e86 04001000 c03e9b08 00000000 .........>...... Item 103 -- Length: 16 Offset: 6528 (0x1980) Flags: USED Block Id: 165391 linp Index: 1 Size: 16 Has Nulls: 0 Has Varwidths: 0 1980: 02000f86 01001000 c03e9b08 01000000 .........>...... Item 104 -- Length: 16 Offset: 6512 (0x1970) Flags: USED Block Id: 165391 linp Index: 2 Size: 16 Has Nulls: 0 Has Varwidths: 0 1970: 02000f86 02001000 c03e9b08 02000000 .........>...... Item 105 -- Length: 16 Offset: 6496 (0x1960) Flags: USED Block Id: 165391 linp Index: 3 Size: 16 Has Nulls: 0 Has Varwidths: 0 1960: 02000f86 03001000 c03e9b08 03000000 .........>...... Item 106 -- Length: 16 Offset: 6480 (0x1950) Flags: USED Block Id: 165391 linp Index: 4 Size: 16 Has Nulls: 0 Has Varwidths: 0 1950: 02000f86 04001000 c03e9b08 04000000 .........>...... Item 107 -- Length: 16 Offset: 6464 (0x1940) Flags: USED Block Id: 165392 linp Index: 1 Size: 16 Has Nulls: 0 Has Varwidths: 0 1940: 02001086 01001000 c03e9b08 05000000 .........>...... Item 108 -- Length: 16 Offset: 6448 (0x1930) Flags: USED Block Id: 165392 linp Index: 2 Size: 16 Has Nulls: 0 Has Varwidths: 0 1930: 02001086 02001000 c03e9b08 06000000 .........>...... Item 109 -- Length: 16 Offset: 6432 (0x1920) Flags: USED Block Id: 165392 linp Index: 3 Size: 16 Has Nulls: 0 Has Varwidths: 0 1920: 02001086 03001000 c03e9b08 07000000 .........>...... Item 110 -- Length: 16 Offset: 6416 (0x1910) Flags: USED Block Id: 165392 linp Index: 4 Size: 16 Has Nulls: 0 Has Varwidths: 0 1910: 02001086 04001000 c03e9b08 08000000 .........>...... What I think is going on is that you have a corrupted copy of this index page sitting in shared buffers. It's not been marked dirty, else it'd have been written out by a checkpoint and we could see the damage with pg_filedump. This implies that the problem would go away by itself if you simply shut down and restarted the database. However, since you've been so helpful so far ;-) I wonder whether there is any chance of getting an image of the bad page in memory before you do that. We are presumably looking at the results of an unintentional memory stomp somewhere in the code, and it's possible that a dump of the damaged page would give a clue how it happened. The only simple way I can think of for getting that info involves gdb, but I don't suppose you are using a copy built with --enable-debug ... anyone have other ideas? regards, tom lane