Pavan writes:

>> FWIW one of our support customers reported a very similar TOAST table
>> corruption issue last week which nearly caused an outage. After a lot of
>> analysis, I think I've now fully understood the reasons behind the
>> corruption, the underlying bug(s) and possible remedy. I am currently
>> working on writing a reproducible test case to demonstrate the problem and
>> writing the fix. More details on that soon.
>
> I've posted a reproducer and a proposed fix to -hackers [1]
>
> In the particular case that I investigated, a database crash/recovery was
> involved. But I think we should be able to create a scenario where OID
> wrap-around or a standby promotion triggers the problem. I don't know if
> any of that was involved in the cases reported on this thread, but I've a
> strong suspicion that the underlying bug is probably the same.

In our case there was no crash+recovery, but we do have a high write
load (and the problem occurs only rarely), so it sounds like it fits.

> [1] https://www.postgresql.org/message-id/CABOikdOgWT2hHkYG3Wwo2cyZJq2zfs1FH0FgX-%3Dh4OLosXHf9w%40mail.gmail.com

Impressive investigation!

Just a couple of data points: your script reproduces the problem on all
three versions of PostgreSQL we have in use: 9.3.22, 9.6.8 and 10.3. I
also ran it on the production machine that exhibits the problem (running
9.3.22), and it reproduced the corruption there as well, i.e. ending
with:

  ERROR:  unexpected chunk number 0 (expected 1) for toast value 16396 in pg_toast_16384
  REINDEX
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 16396 in pg_toast_16384
  VACUUM
  ERROR:  unexpected chunk number 0 (expected 1) for toast value 16396 in pg_toast_16384

Best regards,

  Adam

-- 
 "No more than that, but very powerful all the        Adam Sjøgren
  same; simple things are good."                   adsj@xxxxxxxxxxxxx